Information Science Reference, Part 5


Geospatial Image Metadata Catalog Services

INTRODUCTION

As earth observation continues worldwide, large volumes of remotely sensed data on the Earth’s climate and environment have been collected and archived. In order to maintain the data archives efficiently and to facilitate discovery by users of desired data in the holdings, each data provider normally maintains a digital metadata catalog. Some online catalogs provide services to users for searching the catalog and discovering the data they need through a well-established Application Programming Interface (API). Such services are called Catalog Services. The information in the catalog is the searchable metadata that describe individual data entries in the archives. Currently most Catalog Services are provided through Web-based interfaces. This chapter analyzes three open catalog service systems. It reviews the metadata standards, catalog service conceptual schemas and protocols, and the components of catalog service specifications.

REVIEW OF GEOSPATIAL IMAGE CATALOG SERVICES

2.1 Pilot Catalog Service Systems

The Federal Geographic Data Committee (FGDC) Clearinghouse is a virtual collection of digital spatial data distributed over many servers in the United States and abroad. The primary intention of the Clearinghouse is to provide discovery services for digital data, allowing users to evaluate its quality through metadata. Most metadata provide information on how to acquire the data; in many cases, links to the data or an order form are available online.

The NASA Earth Observing System ClearingHOuse (ECHO) is a clearinghouse of spatial and temporal metadata that enables the science community to exchange data and information. ECHO technology can provide metadata discovery services and serve as an order broker for clients and data partners. All the NASA Distributed Active Archive Centers (DAACs), as data providers, generate and ingest metadata information into ECHO.

The Open Geospatial Consortium (OGC) has promoted
standardization and interoperability among the geospatial communities. For catalog services, OGC has defined the Catalog Service implementation standard (OpenGIS, 2004) and published two recommendation papers (OpenGIS, 2005a; OpenGIS, 2005b). The George Mason University (GMU) CSISS Catalog Service for Web (CSW) system is an OGC-compliant catalog service that demonstrates how the earth science community can publish geospatial resources and discover them by searching pre-registered spatial and temporal metadata information. In particular, the GMU CSISS CSW catalog service is based on the OpenGIS implementation standard and the ebRIM application profile (OpenGIS, 2005). It provides users with an open and standard means to access more than 15 terabytes of global Landsat datasets.

2.2 Conceptual System Architecture

Since these geospatial catalog services address similar needs, it is not surprising that they have almost the same conceptual system architecture, as shown in the conceptual architecture figure. From the point of view of metadata circulation, a catalog service usually consists of three components: metadata generation and ingestion, a conceptual schema for the catalog service, and a query interface for the catalog service.

Metadata generation and ingestion is always based on applicable metadata standards, such as the Dublin Core (DCMI, 2003), ISO 19115 Geographic information - Metadata from the International Organization for Standardization (ISO, 2003), the Content Standard for Digital Geospatial Metadata (CSDGM) from the Federal Geographic Data Committee (FGDC, 1998), or the ECS Earth Science Information Model from the National Aeronautics and Space Administration (NASA, 2006).

[Figure: Conceptual Architecture of Catalog Service. Elements shown: User, Catalog Service Client, Query Interface, Catalog Service Conceptual Schema, Metadata Holdings, Data Holdings.]

Metadata structures, relationships and definitions, known as conceptual schemas, play a key role in catalog services. They define what kind of metadata information
can be provided and how the metadata are organized. The conceptual schemas are closely related to those of the pre-ingested metadata information, but are not necessarily identical. Catalog service conceptual schemas are always oriented toward the field of application and may be tailored to particular application profiles.

The query interface for a catalog service defines the necessary operations, the syntax of each operation, and the binding protocol. To facilitate access and promote interoperability among catalog services, the interface definition may be kept open.

2.3 Metadata Generation

In this section, the three open catalog services identified in Section 2.1 are analyzed on the following two aspects regarding metadata generation.

2.3.1 Base Metadata Standard

The base metadata standard is the public geospatial metadata standard on which the catalog service is based and to which the catalog service is tailored, to meet a given agency’s requirements. In addition to international and national geospatial metadata standards, such as ISO 19115 and FGDC CSDGM, several agencies may have de facto standards in their production environment, such as NASA ECS.

The metadata used by the FGDC Clearinghouse follows FGDC CSDGM. Each affiliated catalog service site must organize its metadata information following the CSDGM standard before it joins the clearinghouse. The ECHO Science Metadata Conceptual Model has been developed based on the NASA Earth Observing System Data and Information System (EOSDIS) Core System (ECS) Science Data Model, with modifications to suit project needs. GMU CSISS CSW builds up its metadata conceptual model by combining the ebRIM information model and the ECS science data model.

2.3.2 Automatic Generation of Metadata

As the volume of spatial datasets keeps growing, generation of metadata becomes increasingly time-consuming. An automatic mechanism for generating metadata will facilitate the generation and frequent update of metadata.

Metadata information needs to be
organized as TXT, SGML or HTML files before a node joins the FGDC clearinghouse. Some metadata generation tools are available in addition to the commercial software packages. These tools are advertised on the FGDC website. To help the user set up a clearinghouse node easily, a software package, ISite, is provided. With this software, a qualified clearinghouse node server can be set up in minutes.

All the ECHO metadata holdings are obtained directly from the data providers. DAACs can use some ECS tools to automatically generate metadata information.

GMU CSISS is developing Java-based tools to automatically extract metadata information from each granule. The Hierarchical Data Format (HDF), Hierarchical Data Format - Earth Observing System (HDF-EOS), GeoTIFF and NetCDF data formats are currently supported.

2.4 Metadata Ingestion

2.4.1 Metadata Distribution

This function deals with the physical distribution of metadata information within the catalog service.

The FGDC Clearinghouse is a decentralized system of servers that contain field-level metadata descriptions of available digital spatial data located on the Internet. The metadata information is physically managed within the affiliated server node.

In the ECHO scenario, although the metadata information is periodically generated by the distinct data centers, it is centrally managed by the ECHO operation team. That is, at design time the metadata information in ECHO is distributed, while at run time it is managed centrally.

The GMU CSISS CSW maintains more than 15 terabytes of global Landsat images. All the metadata information for these images has been registered into a centralized metadata database.

2.4.2 Ingestion Type

This section examines how each catalog service ingests metadata. It focuses on two aspects: remote vs. local and automatic vs. manual.

In the FGDC Clearinghouse, all the metadata information is manipulated only in the affiliated server node. Remote ingestion
is not supported in server nodes. The ingestion has to be done manually.

Where the metadata information is centralized, a database approach is taken. Metadata ingestion in ECHO involves two steps. Data centers need to upload their current metadata information remotely to a dedicated File Transfer Protocol (FTP) server, and the ECHO operation team is responsible for ingesting this metadata information into the ECHO operational system.

GMU CSISS CSW provides published interfaces. As long as the metadata information is well organized, it can be remotely ingested into the GMU CSISS CSW metadata database. All the metadata information in that database is online and ready for client queries.

2.5 Conceptual Schema

We examine how the metadata conceptual schema is defined in each catalog service.

In each FGDC Clearinghouse collection, all the metadata information is organized according to the FGDC CSDGM. The conceptual schema of an FGDC Clearinghouse collection is exactly the same as that of the FGDC CSDGM.

In ECHO, all the metadata information collected in the NASA DAACs is based on the ECS science data model, with some modifications necessary to suit project needs.

GMU CSISS CSW defines its conceptual schema based on the ECS science data model combined with ISO 19115. Since GMU CSISS CSW supports metadata queries and data retrieval (through the OGC services), an ebRIM-based profile has been selected to support defining the association between a data granule instance and applicable geospatial service instances.

2.6 Transfer Protocol

A catalog service usually provides a standard, API-based interface to support client queries. This “design-by-contract” mechanism encourages third parties to develop new query interfaces, in addition to the Web-based query interfaces provided by the catalog server itself.

The backbone of the FGDC Clearinghouse is Z39.50 (ISO, 1998). This protocol was initially developed by the library community to discover
bibliographic records using a standard set of attributes. To guide the implementation of FGDC metadata elements within a Z39.50 service, the FGDC has developed an application profile for geospatial metadata called "GEO," which provides sets of attributes, operators, and rules of implementation that suit geospatial needs. In fact, the node server is a Z39.50 server, which enables FGDC query utilities to search its metadata holdings on the fly through the Z39.50 protocol and the GEO profile.

ECHO exposes the Session Manager and a limited set of the ECHO services as Web Services defined via the Web Services Description Language (WSDL). ECHO also provides two client packages, Façade and EchoTalk, for client developers. The syntax of the communication protocol between client and ECHO is based on the Web Services Interoperability (WS-I) Basic Profile. However, the semantics of the communication protocol are defined by ECHO itself. A specific query syntax, in Extensible Markup Language (XML) format, has been proposed and implemented.

GMU CSISS CSW’s communication protocol is based on the OGC Catalog Service Implementation Specification, which specifies the interfaces and several applicable bindings for catalog services. Operations, core information schema and query language encodings are included. The transportation-related communication protocol follows this specification.

2.7 System Distribution

This section examines the physical distribution of catalog service systems.

The FGDC Clearinghouse had 400 registered nodes worldwide as of March 22, 2006. FGDC maintains several Web-based search interfaces to carry out distributed searches across multiple clearinghouse nodes.

ECHO acts as an intermediary between data partners and client partners. Data partners provide information about their data holdings, and client partners develop software to access this information through the ECHO Query and Order Web Service interface. End users who want to search ECHO's metadata must use one of the ECHO clients. Although
ECHO has close connections with the DAACs and ECHO clients, ECHO itself is not a distributed system. It does not need to build a distributed search across multiple agencies and nodes at run time.

GMU CSISS CSW is a standalone service. Like ECHO, it is not a distributed system.

2.8 Review Summaries

The following table summarizes the results of the analysis.

Table: Review summaries

Evaluation Points | FGDC Clearinghouse | NASA ECHO | GMU CSISS CSW
Metadata generation: Base standard | FGDC CSDGM | ECS Core | ECS Core/ISO 19115
Metadata generation: Generation automation | Manually with tools | Manually with tools | Automatically
Metadata ingestion: Metadata distribution | Distributed | Centralized | Centralized
Metadata ingestion: Ingestion type | N/A | Remotely and automatically | Locally and automatically
Conceptual schema | FGDC CSDGM | Based on ECS Core | Based on ISO 19115 and ebRIM
Transfer protocol | Z39.50 and GEO profile | Proprietary and based on Web Service | OGC Catalog Service and HTTP binding
System distribution | Distributed | Centralized | Centralized

CONCLUSION AND DISCUSSION

We have reviewed three public catalog services (FGDC Clearinghouse, NASA ECHO and GMU CSISS CSW), considering the following aspects: metadata generation, metadata ingestion, catalog service conceptual schema, query protocols and system distribution. This review shows how it is becoming possible to query metadata holdings through public, standard Web-based query interfaces. The review results also show that catalog service providers still must define a catalog service schema that meets their particular needs. These application-oriented approaches can meet project requirements, but they will make it more difficult to create future cross-federation, multi-catalog services. We recommend that a standard, common schema based on discipline-oriented metadata be used for future implementations of catalog services in the same or related fields.

REFERENCES

DCMI (2003). DCMI Metadata Terms. Retrieved
March 8, 2007, from http://dublincore.org/documents/dcmi-terms/
ECHO (2005). Earth Observing System Clearinghouse. Retrieved March 8, 2007, from http://www.echo.eos.nasa.gov/
FGDC (1998). Content Standard for Digital Geospatial Metadata (CSDGM). Retrieved March 8, 2007, from http://fgdc.er.usgs.gov/metadata/contstan.html
FGDC (2005). FGDC Geospatial Data Clearinghouse Activity. Retrieved March 8, 2007, from http://www.fgdc.gov/clearinghouse/clearinghouse.html
ISO (1998). ISO 23950: Information and documentation - Information retrieval (Z39.50) - Application service definition and protocol specification.
ISO (2003). ISO 19115: Geographic Information - Metadata.
LAITS (2005). LAITS OGC Catalog Service for Web - Discovery Interface. Retrieved March 8, 2007, from http://geobrain.laits.gmu.edu/csw/discovery/
NASA (2006). EOSDIS Core System Data Model. Retrieved March 8, 2007, from http://spg.gsfc.nasa.gov/standards/heritage/eosdis-core-systemdata-model
OpenGIS (2004). OpenGIS Catalogue Service Implementation Specification. Retrieved March 8, 2007, from http://www.opengeospatial.org/specs/?page=specs
OpenGIS (2005a). OGC Recommendation Paper 04-17r1: OGC Catalogue Services - ebRIM (ISO/TS 15000-3) profile of CSW. Retrieved March 8, 2007, from http://www.opengeospatial.org/specs/?page=recommendation
OpenGIS (2005b). OGC Recommendation Paper 04-038r2: ISO19115/ISO19119 Application Profile for CSW 2.0. Retrieved March 8, 2007, from http://www.opengeospatial.org/specs/?page=recommendation

KEY TERMS

Catalog Service: A set of information, consisting of some or all of directory, guide, and inventories, combined with a mechanism to provide responses to queries, possibly including ordering data. (Source: Earth Science and Applications Data System)

Catalog System: An implementation of a directory, plus a guide and/or inventories, integrated with user support mechanisms that provide data access and answers to inquiries. Capabilities may include
browsing, data searches, and placing and taking orders. A specific implementation of a catalog service. (Source: Earth Science and Applications Data System, Interagency Working Group on Data Management for Global Change, European Patent Organisation)

Service: A distinct part of the functionality that is provided by an entity through interfaces. (Source: ISO 19119: Geographic information - Services)

Interface: A named set of operations that characterize the behavior of an entity. (Source: ISO 19119: Geographic information - Services)

Operation: A specification of a transformation or query that an object may be called to execute. (Source: ISO 19119: Geographic information - Services)

Transfer Protocol: A common set of rules for defining interactions between distributed systems. (Source: ISO 19118: Geographic information - Encoding)

Chapter XXIII
Geospatial Semantic Web: Critical Issues

Peisheng Zhao, George Mason University, USA
Liping Di, George Mason University, USA
Wenli Yang, George Mason University, USA
Genong Yu, George Mason University, USA
Peng Yue, George Mason University, USA

ABSTRACT

The Semantic Web technology provides a common interoperable framework in which information is given a well-defined meaning such that data and applications can be used by machines for more effective discovery, automation, integration and reuse. Parallel to the development of the Semantic Web, the Geospatial Semantic Web, a geospatial domain-specific version of the Semantic Web, has recently been initiated. Among all the components of the Geospatial Semantic Web, two are especially unique: geospatial ontology and geospatial reasoning. This paper focuses on these two critical issues, from representation logic to computational logic.

Copyright © 2009, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.

INTRODUCTION

Inspired by Tim Berners-Lee (Berners-Lee, 1998; W3C, 2006), inventor of the Web, a
growing number of individuals and groups from academia and industry have been evolving the Web to another level: the Semantic Web. By representing not only words, but their definitions and contexts, the Semantic Web provides a common interoperable framework in which information is given a well-defined meaning such that data and applications can be used by machines (reasoning) for more effective discovery, automation, integration and reuse across various application, enterprise and community boundaries. Compared to the conventional Web, the Semantic Web excels in two aspects (W3C, 2006): 1) common formats for data interchange (the original Web only had interchange of documents) and 2) a language for recording how the data relates to real world objects. With such advancements, reasoning engines and Web-crawling agents can go one step further and inductively respond to questions such as “which airfields within 500 miles of Kandahar support C5A aircraft?” rather than simply returning Web pages that contain the text “airfield” and “Kandahar”, which is what most engines do today.

The Semantic Web has a hierarchical, layered architecture (see the Semantic Web architecture figure). At the bottom level, XML (Extensible Markup Language) provides syntax to represent structured documents with a user-defined vocabulary but does not necessarily guarantee well-defined semantic constraints on these documents. XML Schema defines the structure of an XML document. RDF (Resource Description Framework) is a basic data model with XML syntax that identifies objects (“resources”) and their relations to allow information to be exchanged between applications without loss of meaning. RDFS (RDF Schema) is a semantic extension of RDF for describing the properties of generalization hierarchies and classes of RDF resources. OWL (Web Ontology Language) adds vocabulary to explicitly represent the meaning of terms and their relationships, such as relations between classes (e.g., disjointness), cardinality (e.g., “exactly one”), equality and
enumerated classes. The logic layer represents the facts and derives knowledge, while the proof layer handles the deductive process and proof validation. A digital signature can be used to sign and export the derived knowledge. A trust layer provides the trust level or a rating of its quality in order to help users build confidence in the process and quality of information (Antoniou & Harmelen, 2004).

[Figure: Semantic Web architecture (Berners-Lee, 2000)]

Parallel to the development of the Semantic Web, the Geospatial Semantic Web, a geospatial domain-specific version of the Semantic Web, has recently been initiated. Because geospatial information is heterogeneous, i.e., multi-source, multi-format, multi-scale, and multi-disciplinary, the importance of semantics in accessing and integrating distributed geospatial information has long been recognized (Sheth, 1999). The advent of the Semantic Web promises a generic framework that uses ontologies to capture the meanings and relations needed for information retrieval. But this framework does not relate explicitly to some of the most basic geospatial entities, properties and relationships that are most critical to a particular geospatial information processing task. To better support the discovery, retrieval and consumption of geospatial information, the Geospatial Semantic Web has been initiated to create and manage geospatial ontologies that capture the semantic network of the geospatial world and allow intelligent applications to take advantage of built-in geospatial reasoning capabilities for deriving knowledge. It will do so by incorporating geospatial data semantics and exploiting the semantics of both the processing of geospatial relationships and the description of tightly coupled service content (Egenhofer, 2002; Lieberman, Pehle, & Dean, 2005). The Geospatial Semantic Web was identified as an immediate research priority early in 2002 (Fonseca & Sheth, 2002) by UCGIS (University Consortium for Geospatial
Information Science). As an international voluntary consensus standards organization, OGC (Open Geospatial Consortium) conducted the Geospatial Semantic Web Interoperability Experiment (GSW-IE) in 2005, aiming to develop a method of discovering, querying and collecting geospatial content on the basis of formal semantic specifications.

The architecture of the Geospatial Semantic Web is similar to that portrayed in the Semantic Web architecture figure. The Geospatial Semantic Web and the Semantic Web share top-level (general) ontology, ontological languages, and general reasoning mechanisms. The Geospatial Semantic Web extends the Semantic Web with domain-specific components. Among all the components of the Geospatial Semantic Web, two are especially unique: geospatial ontology and geospatial reasoning. The former aims at expressing geospatial concepts and relationships, specialized processing of geospatial rules and relationships, and self-described Web services with highly dynamic geospatial content, beyond the purely lexical and syntactic level (Egenhofer, 2002; Lieberman et al., 2005; O'Dea, Geoghegan, & Ekins, 2005). The latter embraces sets of geospatial inference rules based on geospatial ontologies, and techniques for automated geospatial reasoning by machine, with less human interaction, to derive geospatial knowledge. These two are the foci elaborated in the following two sections of this paper. Two application cases are presented to show the combined achievements of the Geospatial Semantic Web. A short summary is given at the end.

GEOSPATIAL ONTOLOGY

It is widely recognized that ontology is critical for the development of the Semantic Web. Ontology originated in philosophy as a reference to the nature and the organization of reality. In general, an ontology is a “specification of a conceptualization” (Gruber, 1993). In the computer science domain, ontology provides a commonly agreed upon understanding of domain knowledge in a generic way for sharing across
applications and groups (Chandrasekaran, Johnson, & Benjamins, 1999). Typically, an ontology consists of a list of terms (classes of objects) and the relationships between those terms. Moreover, an ontology can also represent property information (e.g., an airfield has runways), value restrictions (e.g., aircraft can only take off at an airfield), disjointness statements (e.g., aircraft and train are disjoint), and specification of logical relationships between objects (e.g., a runway must be at least 400 meters long for supporting C5A aircraft).

In the geospatial domain, a specific range of geospatial ontologies is needed to define a formal vocabulary that sufficiently captures the semantic details of geospatial concepts, categories, relations and processes as well as their interrelations at different levels. A geospatial ontology does not simply give a definition, but also represents relationships between concepts. For example, an ontological definition of “surface water” describes its properties and characteristics but also carries relationship meanings to other entities, such as “surface water” belongs to “hydrosphere”, and “river” is a kind of “surface water”. A well-formed geospatial ontology is very useful in the following areas:

• Interoperability. Since the geospatial sciences deal with phenomena across a variety of scales and disciplines, the semantics of geospatial information is essential for the development of interoperable geospatial software and data formats. Geospatial ontology provides a common understanding of not only general geospatial concepts but also complex geospatial scientific computing. Through geospatial ontology, the different geospatial data models and representations can be integrated.
• Spatial reasoning about geospatial associations and patterns, e.g., topological relations (connectivity, adjacency and intersection of geospatial objects), cardinal direction (relative directions among geospatial objects, such as east,
west and southwest), proximity relations (geographical distance between geospatial objects, such as A is close to B and X is very far from Y), and contextual relations (such as an obstacle that separates two objects that would otherwise be considered near each other, but are considered far apart because of the obstacle) (Arpinar, Sheth, & Ramakrishnan, 2004).
• Reuse and organization of information, such as standardizing libraries or repositories of geospatial information and workflows.

Compared to general ontologies, geospatial ontologies specifically encode 1) spatial concepts, e.g., location and units, 2) spatial relationships, e.g., inside, near and east, 3) physical facts, e.g., physical phenomena, physical properties and physical substances, 4) geospatial data, e.g., data properties, such as instruments, platforms and sensors, and 5) geospatial computing processes, e.g., disciplines, parameters and algorithms. According to their interactions and roles within the context of the Geospatial Semantic Web, geospatial ontologies can be classified into several large groups with hierarchical relationships, as shown in the accompanying figure, in which the ontologies at upper levels are consistent with the ontologies at lower levels.

General ontology is the core upper-level vocabulary representing the common human consensus reality that all other ontologies must reference. It is domain independent. The widely used Dublin Core Metadata (Dublin, 2006) provides a standard for metadata vocabularies to describe resources, enabling the development of more intelligent information discovery systems. OpenCyc (OpenCyc, 2006) is the world's largest and most complete general knowledge base and commonsense reasoning engine, defining more than 47,000 upper-level concepts and 306,000 assertions about these concepts.

Geospatial feature ontology, defining geospatial entities and physical phenomena, provides the core geospatial vocabulary and structure, and forms the ontological foundation of geospatial information. It should be coordinated with
the development of geospatial standards to define its scope and content, such as the ISO 19100 series and the OGC specifications.

Geospatial factor ontology describes geospatial location, unit conversion factors and numerical extensions. To enable geospatial topological, proximity [...]

Towards Automatic Composition of Geospatial Web Services

[...] to annotate data and services (Lutz and Klien, 2006). Ontologies, related in both simple taxonomic and non-taxonomic ways, are employed using subsumption reasoning (Baader and Nutt, 2003) to improve service discovery and the recall and resolution of data. Template operations are introduced for semantic annotation of service input/output and functionality (Lutz, 2004).

KEY CONSIDERATIONS AND POSSIBLE SOLUTIONS

The related work described so far helps identify the particular requirements of the geospatial domain that automatic service composition must satisfy.

• Data-intensive: Geospatial data for processing are typically high-volume and diverse, with inherent disciplinary complexity. Data plays an important role in geospatial service composition, since rich, explicit, and formalized semantics of geospatial data (beyond traditional metadata) allow a machine to understand and automatically discover the appropriate data for a service’s input (rather than by keyword matching). Formal conceptualization of data semantics requires the combination of geospatial domain knowledge with certain knowledge representation techniques, e.g., formal ontologies. Semantic Web standards such as OWL provide such support. In addition, complete data semantics can support metadata tracking in service chaining, which helps ensure trustworthy data for users (Alameh, 2003).
• Compute-intensive: Geoprocessing functions are complex, time-consuming and data-dependent. For compute-intensive applications, offline planning is preferred to online planning. In offline planning, the process model for service composition is generated before the execution of the
service component, e.g., SWORD (Ponnekanti and Fox, 2002). Online planning is usually useful when the information for generating the process model is incomplete, so that a service component must be invoked as the information provider. The actual process model is created at run time, e.g., SHOP2 (Wu et al., 2003). Given the resources consumed by geospatial processing services, offline planning can bring predictability and efficiency. Alternative process models should be created to deal with the possible inapplicability of certain process models. In addition, service semantics also need to be explicitly formalized, and the inherent relation to data should be identified. Ontology descriptions using OWL-S will enable the reasoning and chaining of services, such as the aggregate service and workflow-managed chaining identified in the OGC geoprocessing architecture.
• Analysis-intensive: Geospatial applications involve diverse sources of data and complex processing functions. Analysis-intensive applications require that the inherent relations between multiple geospatial data and services be captured at an upper level. These relations can be constructed through geospatial ontologies and rules (e.g., SWRL) (Horrocks et al., 2004), serving as the knowledge base for AI methods. One example rule is that a Web Coordinate Transformation Service (WCTS) can be introduced into the service chain automatically when the spatial projection of the available data cannot satisfy the spatial projection requirement of the service’s input.

CONCLUSION

The wide application of Web service technologies to the geospatial domain raises the challenge of geospatial Web service composition. Automatic composition of geospatial Web services offers a promising way to facilitate the use of geospatial information over the Web. This paper introduces techniques for automatic Web service composition and current progress related to the geospatial domain. The key
considerations discussed in this paper offer a guide to the further exploration of this subject.

References

Aalst, W. (2003). Don’t Go with the Flow: Web Services Composition Standards Exposed. IEEE Intelligent Systems, January/February 2003, pp. 72-76.
Aissi, S., Malu, P., & Srinivasan, K. (2002, May). E-Business Process Modeling: The Next Big Step. IEEE Computer, 35(5), 55-62.
Alameh, N. (2003). Chaining Geographic Information Web Services. IEEE Internet Computing, 7(5), 22-29.
Altintas, I., Birnbaum, A., Baldridge, K., Sudholt, W., Miller, M., Amoreira, C., Potier, Y., & Ludäscher, B. (2004). A Framework for the Design and Reuse of Grid Workflows. Intl. Workshop on Scientific Applications on Grid Computing, SAG 2004, LNCS 3458, Springer, pp. 119-132.
Baader, F., & Nutt, W. (2003). Basic description logics. In F. Baader, D. Calvanese, D. McGuinness, D. Nardi and P. Patel-Schneider (Eds.), The Description Logic Handbook, pp. 43–95. Cambridge University Press.
Benatallah, B., Dumas, M., Fauvet, M-C., & Rabhi, F.A. (2001). Towards patterns of Web service composition. Technical report, UNSW-CSE-TR-0111, University of New South Wales, 35 pp.
Booth, M., Haas, H., McCabe, F., Newcomer, E., Champion, M., Ferris, C., & Orchard, D. (Eds.) (2004). Web Services Architecture. W3C Working Group Note, 11 February 2004, W3C, http://www.w3.org/TR/ws-arch/
Bowers, S., & Ludäscher, B. (2004). An Ontology-Driven Framework for Data Transformation in Scientific Workflows. In Proc. of the Intl. Workshop on Data Integration in the Life Sciences (DILS), Volume 2994 of LNCS, Springer, 2004, pp. 1-16.
Casati, F., Ilnicki, S., & Jin, L. (2000). Adaptive and dynamic service composition in EFlow. In Proceedings of 12th International Conference on Advanced Information Systems Engineering (CAiSE), Stockholm, Sweden, June 2000. Springer Verlag.
Casati, F., Sayal, M., & Shan, M. (2001). Developing E-Services for composing E-Services. In Proceedings of 13th International Conference on Advanced Information Systems Engineering (CAiSE), Interlaken, Switzerland, June 2001. Springer Verlag, p. 27.
Chandrasekaran, S., Madden, S., & Ionescu, M. (2000). Ninja Paths: An Architecture for Composing Services over Wide Area Networks. CS262 class project writeup, UC Berkeley, 2000, 15 pp.
Dean, M., & Schreiber, G. (Eds.) (2004). OWL Web Ontology Language Reference. W3C, http://www.w3.org/TR/owl-ref
Di, L. (2004). Distributed Geospatial Information Services: Architectures, Standards, and Research Issues. The International Archives of Photogrammetry, Remote Sensing, and Spatial Information Sciences, XXXV(2), Commission II.
Di, L., Zhao, P., Yang, W., Yu, G., & Yue, P. (2005, July). Intelligent Geospatial Web Services. Geoscience and Remote Sensing Symposium, 2005, IGARSS ‘05, Proceedings, 2005 IEEE International, 2, 1229-1232.
Horrocks, I., et al. (2004). Semantic Web Rule Language (SWRL). http://www.w3.org/Submission/SWRL/
Jaeger, E., Altintas, I., Zhang, J., Ludäscher, B., Pennington, D., & Michener, W. (2005). A Scientific Workflow Approach to Distributed Geospatial Data Processing using Web Services. 17th International Conference on Scientific and Statistical Database Management (SSDBM’05), 27-29 June 2005, Santa Barbara, California, pp. 87-90.
Klusch, M., Gerber, A., & Schmidt, M. (2005). Semantic Web Service Composition Planning with OWLS-Xplan. Agents and the Semantic Web, 2005 AAAI Fall Symposium Series, Arlington, Virginia, USA, November 2005.
Ludäscher, B., Altintas, I., Berkley, C., Higgins, D., Jaeger, E., Jones, M., Lee, E., Tao, J., & Zhao, Y. (2005). Scientific Workflow Management and the Kepler System. Concurrency and Computation: Practice & Experience, Special Issue on Scientific Workflows, to appear, 2005, 19 pp.
Lutz, M. (2004). Non-taxonomic Relations in Semantic Service Discovery and Composition. In Maurer, F. & Ruhe, G. (Eds.), Proceedings of the First “Ontology in Action” Workshop, in conjunction with the Sixteenth International Conference on Software Engineering & Knowledge Engineering (SEKE’2004), pp. 482–485.
Lutz, M., & Klien, E. (2006). Ontology-Based Retrieval of Geographic Information. International Journal of Geographical Information Science, 20(3), 233-260.
Martin, D., et al. (2004). OWL-based Web Service Ontology (OWL-S).
http://www.daml.org/services/owl-s/1.1/
McIlraith, S. A., & Son, T. C. (2002). Adapting Golog for Composition of Semantic Web Services. In D. Fensel, F. Giunchiglia, D. McGuinness, and M.-A. Williams (Eds.), Proc. of the 8th International Conference on Principles and Knowledge Representation and Reasoning (KR’02), pages 482-496, France, 2002. Morgan Kaufmann Publishers.
Medjahed, B., Bouguettaya, A., & Elmagarmid, A. K. (2003, November). Composing Web services on the Semantic Web. The VLDB Journal, 12(4), 333-351.
Narayanan, S., & McIlraith, S. A. (2002). Simulation, verification and automated composition of web services. In WWW ‘02: Proceedings of the eleventh international conference on World Wide Web, pages 77-88. ACM Press.
Peer, J. (2005). Web Service Composition as AI Planning - A Survey. University of St. Gallen, Switzerland, 63 pp.
Percivall, G. (Ed.) (2002). The OpenGIS Abstract Specification, Topic 12: OpenGIS Service Architecture, Version 4.3, OGC 02-112. Open GIS Consortium Inc., 78 pp.
Ponnekanti, S. R., & Fox, A. (2002). SWORD: A Developer Toolkit for Web Service Composition. In Proceedings of the International World Wide Web Conference, Honolulu, Hawaii, USA, May 2002, pp. 83-107.
Rao, J., Küngas, P., & Matskin, M. (2004). Logic-based Web Services Composition: From Service Description to Process Model. In Proceedings of the 2004 International Conference on Web Services, San Diego, USA, July 2004, pp. 446-453.
Rao, J., & Su, X. (2004). A Survey of Automated Web Service Composition Methods. Proceedings of the First International Workshop on Semantic Web Services and Web Process Composition, SWSWPC 2004, California, USA, pp. 43-54.
Russell, S., & Norvig, P. (2002). Artificial Intelligence: A Modern Approach (2nd edition, December 20, 2002). Prentice-Hall Inc., p. 375.
Sheshagiri, M., desJardins, M., & Finin, T. (2003). A planner for composing services described in DAML-S. Proceedings of AAMAS 2003 Workshop on Web Services and Agent-Based Engineering.
Sirin, E., Hendler, J., & Parsia, B. (2003). Semiautomatic composition of web services using semantic descriptions. In Proc. Workshop on Web Services: Modeling, Architecture and Infrastructure (WSMAI), pp. 17-24. ICEIS Press, 2003.
Wu, D., Parsia, B., Sirin, E., Hendler, J., & Nau, D. (2003). Automating DAML-S Web services composition using SHOP2. In Proceedings of 2nd International Semantic Web Conference (ISWC2003), Sanibel Island, Florida, October 2003.

Key Terms

Geospatial Web Service: A geospatial Web service is a modular Web application that provides services on geospatial data, information, or knowledge. Geospatial Web services can perform any function from a simple geospatial data request to complex geospatial analysis.

Geospatial Web Service System: A geospatial problem-solving environment based on Web service technologies.

OGC Web Service: OGC Web Services are an evolutionary, standards-based framework that enables seamless integration of a variety of online geoprocessing and location services in a distributed environment, regardless of the format, projection, resolution, and archive location. Currently, most OGC Web service implementations provide access via HTTP GET and HTTP POST and do not support SOAP.

Ontology: From the philosophical perspective, the nature and existence of reality. From the computer science perspective, a shared and common vocabulary for a knowledge domain, the definition of its terms, and the relations among them.

Process Instantiation: Dynamic selection and binding of individual Web services to the operators in the process model. The criteria for selection can be based on QoS (Quality of Service) information.

Process Model: An ordered sequence of cooperative operators. An operator represents a certain type of service whose functionality and input/output data are semantically consistent. Two consecutive operators must exchange data with the same semantics. Information about an individual service at a particular address is not included.

Semantic
Web: A framework for sharing data across boundaries, to serve people more conveniently. Initiated by the World Wide Web Consortium (W3C).

Service Composition: The process of creating a service chain, which includes service discovery, service selection, composition method and composite service representation.

Endnotes

The work presented in this chapter was carried out while the author was still working with the Center for Spatial Information Science and Systems (CSISS), George Mason University, USA.
Hereafter, service, if not specified, means Web service.
This is not a Web service; service communication is based on Java RMI.

Chapter XXVII
Grid Computing and its Application to Geoinformatics

Aijun Chen, George Mason University, USA
Liping Di, George Mason University, USA
Yuqi Bai, George Mason University, USA
Yaxing Wei, George Mason University, USA

Abstract

The definition of Grid computing and its application to geoinformatics are introduced. The power grid is compared with the computing Grid, and Web technology is compared with Grid technology. The Hourglass Model of Grid architecture is depicted. The layered Grid architecture, relating to Internet protocol architecture, consists of the fabric (computer, storage, switches, etc.)
layer, connectivity layer, resource layer, collective layer, and application layer. Grid computing has been applied to many disciplines and research areas, such as physics, Earth science, astronomy, bioinformatics, etc. By applying Grid computing to the Open Geospatial Consortium, Inc.’s Web services and to geospatial standards from the International Organization for Standardization, the US Federal Geographic Data Committee and US NASA, a geospatial Grid is proposed here, consisting of Grid-managed geospatial data and Grid-enabled geospatial services.

Copyright © 2009, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited.

INTRODUCTION

Grid computing, defined in the mid-1990s, has appeared as a new e-science information technology for addressing the formidable challenges associated with the integration of heterogeneous computing systems and data resources. Its goal is to build a global computing space with global resources by securely bringing together geographically and organizationally dispersed computational resources to provide users with advanced ubiquitous distributed sharable computing (Foster et al., 2002; 2001). Currently, the most popular and widely used Grid software is the Globus Toolkit. The newest version, 4.0.1, uses the Open Grid Service Architecture (OGSA) and Web Service Resource Framework (WSRF) specifications to build the computational Grid.

Geoinformatics is the science and art of acquiring, archiving, manipulating, analyzing, communicating, and utilizing spatially explicit data for understanding the physical, biological, and social systems on or near the Earth’s surface (Di et al., 2005). In order to share distributed geospatial resources and facilitate interoperability, the Open Geospatial Consortium (OGC), an industry-government-academia consortium, has developed a set of widely accepted Web-based interoperability standards and protocols, such as the Web Mapping Service (WMS), Web Coverage Service (WCS), and Catalogue Service - Web profile (CS/W) specifications. These services have been widely developed and used by diverse geoinformatics communities.

Grid technologies provide a platform to make every digital resource securely sharable and usable by every qualified user, no matter how the resources are related to a discipline, organization, science, or anything else. The OGC is leading the development of geospatial resource standards for sharing geospatial data and geoprocessing services. Therefore, it is very natural to apply Grid technology to geoinformatics to reach the goal that is being pursued by the OGC. Doing so not only extends Grid capabilities to the geospatial discipline, enriching Grid technology, but also utilizes the newest computer technology to advance research in the discipline.

BACKGROUND

Grid computing is not simply a means for researchers to do existing research faster; it also promises them a number of new capabilities. While the ability to carry out existing experiments in less time is definitely beneficial, other features such as ease of collaboration, reduced cost, and access to increased resources and instrumental results allow more advanced research to be carried out. In order to achieve these goals, considerable work has been put into Grid-enabling technology, including Grid architecture, Grid middleware, authentication mechanisms, resource schedulers, data management and information services. These technologies form the basic services for achieving the goals of the Grid: creating e-Research and e-Commercial environments (Hey & Trefethen, 2002).

The ultimate target of Grid Computing is to establish the Computational Grid, whose idea is analogous to the electric power grid, where power generators are distributed but users are able to access electric power without concern for the source of energy and its location (Figure 1). Today, Grid computing technology is trying to provide computing capabilities as
the electric power grid provides energy capabilities, with the same characteristics of reliability, scalability, security, low cost, and convenience.

[Figure 1. Computational power grid analogous to electric power grid (Myrseth, 2002)]

Grid technology has boomed as a result of the Internet and the rapid development of Web technology. Just as the Web revolutionized information sharing by providing a universal protocol and syntax (HTTP and HTML) for information exchange, the Grid, which mainly consists of standard protocols and syntaxes, is emerging to revolutionize general resource sharing. The similarity between Web and Grid technology is illustrated in Figure 2 (WS-* denotes the Web Service Resource Framework (WSRF)-related specifications for Web Services).

[Figure 2. Comparison of Web technology with Grid technology. The Web: protocols (HTTP), syntaxes (HTML), Web-based information sharing. The Grid: protocols (WS-*), syntaxes (XML), Grid-based resource sharing.]

In the research and development of Grid technology, Services and Protocols come first, followed by Application Programming Interfaces (APIs) and Software Development Kits (SDKs) for accelerating resource sharing and enhancing application portability. They are described through the Grid architecture. Grid architecture therefore defines Protocols, Services, APIs and SDKs, and their relationships. The Grid architecture adopts the well-known Hourglass Model (Figure 3), which is also used by Web technology. The narrow neck of the hourglass defines a small set of core abstractions and protocols onto which many different high-level behaviors can be mapped (the top of the hourglass), and which can themselves be mapped onto many different underlying technologies (the base of the hourglass). The Layered Grid Architecture and its relationship to the Internet protocol architecture are shown in Figure 4 (Foster et al., 2001).

[Figure 3. The hourglass model of Grid architecture: diverse global services at the top, core services and protocols at the neck.]

[Figure 4. Layered Grid Architecture and its relationship to the Internet protocol architecture]

The Fabric layer provides the resources to which shared access is mediated by Grid protocols. Fabric components implement the local, resource-specific operations that occur on specific resources as a result of sharing operations at higher levels.

The Connectivity layer defines the core communication and authentication protocols required for Grid-specific network transactions. Communication protocols enable the exchange of data between Fabric layer resources. Authentication protocols, such as the Grid Security Infrastructure, provide secure mechanisms for verifying the identity of users and resources.

The Resource layer defines protocols (and APIs and SDKs) for the secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources. Information protocols and Management protocols are the two primary classes of Resource layer protocols.

The Collective layer provides protocols and services (and APIs and SDKs) that are not associated with any one specific resource but rather are global in nature and capture interactions across collections of resources. Collective components build on the narrow Resource and Connectivity layers, represented by the “neck” in the protocol hourglass.

The Applications layer comprises the user applications that operate within a Virtual Organization (VO) environment. Applications are constructed in terms of, and by calling upon, services defined at any layer. At each layer, well-defined protocols provide access to some useful services, and APIs may also be defined with the appropriate service(s) to perform desired actions.

Today, Grid Computing has been applied to many disciplines and research areas, and in the future more research and commercial fields will be involved. The reason is that many scientific and commercial endeavors currently need, or will soon need, to contend with the analysis of large datasets by a widely distributed user base with intensive computation requirements. Some examples include: the Grid Physics Network (GriPhyN); the Particle Physics Data Grid (PPDG); the Earth System Grid (ESG); applications in astronomy; and applications in bioinformatics.

Grid technology has already been successfully applied to physics (Beeson et al., 2005). The GriPhyN project addresses some important physics problems, such as the Sloan Digital Sky Survey, detecting Einstein’s gravitational waves, and high-energy particle physics (GriPhyN, 2005). The Particle Physics Data Grid (PPDG) is another successful Grid application in the physical sciences. It is developing a Grid system that vertically integrates experiment-specific applications, Grid technologies, and facility computation and storage resources to provide effective end-to-end capabilities (PPDG, 2005). The future of the Grid in physics is very promising.

The Earth System Grid (ESG), a Grid application in earth sciences, links supercomputers with large-scale data and analysis servers located at numerous national labs and research centers to create a powerful environment for next-generation climate research. This virtual collaborative environment links distributed centers, users, models, and data. Creation of this environment will significantly increase the scientific productivity of U.S. climate researchers (Pouchard et al., 2003; ESG, 2005).

There are also many Grid applications in astronomy. The TeraGrid-based US National Virtual Observatory is making astronomical data easier to use through the creation and adoption of standards (US NVO,
2006). AstroGrid aims to build a data grid for UK astronomy, which will form the UK contribution to a global Virtual Observatory (AstroGrid, 2006). The Japanese Virtual Observatory is designed to provide seamless access to federated databases and various data analysis tools by using state-of-the-art Grid technology (JVO, 2006).

Grid Computing has also been applied to bioinformatics and has produced great achievements. Some examples are the BioGrid in Austria, the Bioportal in the USA, the Bioinformatics Grid in Europe, the National Bioinformatics Grid in South Africa, and the Open Bioinformatics Grid in Japan.

Grid Computing has been applied to many diverse disciplines with many fruitful results and has a promising future. The concept of the “Geospatial Grid” addresses how to use Grid technology to share and serve geospatial resources in geoinformatics. The Geospatial Grid was first discussed by Di et al. (2005). Geospatial data can be served in its original format in a very simple way, e.g., via FTP online or DVD offline. With that approach, users usually get a lot of useless data and have to spend more time and effort to get what they want. With the Geospatial Grid, however, data can be securely served online in a user-customizable way through very flexible, standards-compliant user interfaces.

GEOSPATIAL DATA SERVING and SERVICES OFFERING

How can geospatial data be served through Grid Computing?
Combined with geospatial standards, a Grid computing-based geospatial computational Grid can be built as a Grid-enabled geospatial infrastructure for serving data and offering services. The Geospatial Grid is the extension and application of Grid technology for geoinformatics. It makes Grid technology geospatially enabled, and also makes the archiving, managing, querying, and serving of geospatial resources Grid-enabled. It makes Grid technology transparent to OGC users, who can take advantage of the Grid technology but do not need to learn its details.

The Geospatial Grid consists of Grid-managed geospatial data and Grid-enabled geospatial services, together with their other related resources. It provides completely transparent access to geospatial resources dispersed over the VO, which may consist of any organizations around the world. There are two kinds of standard interfaces for two kinds of users: both Grid users and Web users are able to transparently access geospatial data and services and to customize geospatial data. Researchers and scientists can focus on science rather than on locating, selecting, obtaining, and receiving data and handling data formats; in addition, large volumes of geospatial data can be accessed quickly, and massive idle computing resources over the VO can be utilized efficiently.

Geospatial metadata standards are a very popular and effective solution for registering, managing, sharing and serving geospatial data, but the solution covers the management of only one copy of the data. It is common for several replicas of the same data to exist in different organizations at the same time. How to efficiently archive, manage and serve those replicas is very important not only to data users but also to data providers.

Taken together with the Grid technologies, geospatial metadata standards (e.g., ISO 19115 I & II) and OGC geospatial Web services standards, a Grid-enabled Geospatial Catalog Information
Model is devised to describe geospatial data and their replicas. Its corresponding catalog service is developed to archive, manage, and index massive geospatial data. Using the Grid Replica Location Service (Globus, 2006), the data replicas can be efficiently indexed in the catalog service and provided for user retrieval. User-transparent data serving is provided through the Grid-enabled geospatial services. These geospatial Grid services make access to, management of, and analysis of distributed computing and data resources considerably more efficient for many Earth science-related, data- and compute-intensive scientific research efforts and applications. Built-in geospatial processing functions, such as subsetting, reprojection, resampling, and reformatting, are also provided with the geospatial Grid services. The following geospatial Grid services are designed and developed for serving geospatial data and constructing data service chains based on Grid computing.

The Grid-enabled Catalog Service for the Web (GCS/W) provides Grid-based archiving, publication, management, and retrieval of geospatial data and services, and furnishes transparent access to the replica data and related services in the Grid environment (Yu et al., 2005). The information model of GCS/W is based on that of OGC CS/W. Any potentially detailed and rich set of geospatial data/information can be registered in GCS/W and served through GWCS/GWMS.

The Grid-enabled Web Coverage Service (GWCS) provides OGC standard interfaces that can be accessed by Grid users and Web users for access to all geospatial data registered in GCS/W. The getCoverage interface is the main function for serving geospatial data. Its built-in geoprocessing functions give users very flexible options to customize the data they want. Any Grid services and OGC standards-compliant clients can access and interact with GWCS. It provides access to all geospatial data registered in GCS/W and managed on the Grid, in forms that are useful for client-side rendering, value-added processing, and input into scientific models and other geospatial processing systems.

The Grid-enabled Web Map Service (GWMS) responds to user requests by dynamically producing static maps from spatially referenced geographic data, filtering and portraying geospatial data in the Grid environment. The getMap standard interface is provided. It can securely access any geospatial data to produce a map in the format required by the user.

Not only can all the OGC standard interfaces discussed above be implemented in the Grid environment, but, with the collaboration of an intelligent Replica Optimization Service, faster geoprocessing, better serving performance and better data quality can be achieved. The Replica Optimization Service (ROS) (Chen et al., 2006) is designed to receive data requests from a user’s Web interface, intelligently distribute each request to one of the best-performing GWCS/GWMS instances in the VO, receive the resultant data, and return them to the user. ROS invokes the Grid Replica Location Service and IndexService to select the service location whose host machine provides the best performance and whose data are of the best quality. Usually, the preference is to select the data and the service used to serve them from the same machine, but sometimes data on one machine and a service on a different machine are selected, if the machine hosting the service has excellent performance, the network bandwidth is high and the data volume is not very large. In that case GridFTP is invoked to securely transfer the data to the machine where the service is located.

The Grid Security Infrastructure (GSI) (Globus, 2006) is applied to form the geospatial VO of the Geospatial Grid. The geospatial VO now consists of the VO from the Center for Spatial Information Science and Systems (CSISS) of George Mason University (GMU), the VO from the Ames Research Center (ARC) of the National Aeronautics and Space Administration (NASA), and the VO from the Lawrence Livermore National Laboratory (LLNL) of the Department of Energy (DOE). Each has its own Certificate Authority, which is used to issue host, user and service certificates for qualified users of the geospatial VO. Authorization and authentication are set up between any two VOs and any two machines in the geospatial VO. Thus, any Grid service in the geospatial VO can invoke or be invoked by other Grid services of this VO.

FUTURE TRENDS

An ontology-based semantic Grid is expected to be the primary driver of Grid technology in the near future. In parallel with the development of the Grid, the Semantic Web has emerged as a lightweight solution for the dissemination of knowledge and as an interface for computational services (namely Web Services) (Flahive et al., 2005). Future research looks forward to a close integration of the Grid infrastructure with semantic content: a Semantic Grid based on ontology. A consistent, shared ontology is of critical importance to the sharing of geospatial knowledge, and has long-term value in supporting a systems-level approach to geoinformatics. It could be widely used for geospatial data mining and data visualization, and has great potential for further integration of data across the different levels of Grid computing.

Intelligent virtual modeling based on Grid service chaining has high priority in the evolution of Grid computing. Very complicated geospatial applications can be decomposed and represented through a workflow that captures the intelligence of geospatial experts and can be expressed as a service chain (Chen et al., 2005; Foster et al., 2003). The key issues include the relationships between geoprocessing modules that are Grid services, the concept of virtual geospatial products, and a new architecture for the geospatial Grid that can involve multiple geospatial archives and organizations in providing seamless discovery of and access to distributed geospatial resources. An ontology-driven semantic Grid will then provide much more
intelligence for geospatial application modeling and on-the-fly data production.

CONCLUSION

Serving geospatial data through Grid computing is very promising. Its implementation is discussed here by describing the components and establishment of an open, secure, optimized, and flexible geospatial Grid based on Grid technology. The geospatial Grid has two main functions: serving geospatial data and offering services. Services can be used to serve data or to model new applications. A prototype geospatial Grid has been implemented by three research centers: GMU/CSISS, NASA/ARC and DOE/LLNL.

REFERENCES

AstroGrid (2006). Introduction. Retrieved on August 1, 2006, from http://www.astrogrid.org/
Beeson, B., Melnikoff, S., Venugopal, S., & Barnes, D.G. (2005). A Portal for Grid-enabled Physics. Proceedings of the 2005 Australasian workshop on Grid computing and e-research, 44, 13–20. Newcastle, New South Wales, Australia.
Chen, A., Di, L., Wei, Y., Bai, Y., & Liu, Y. (2006). An optimized Grid-based, OGC standards-compliant collaborative software system for serving NASA geospatial data. 2nd IEEE Systems and Software Week (SASW 2006), 30th Annual Software Engineering Workshop. Loyola College Graduate Center, Columbia, MD, USA. Apr 25-27.
Chen, A., Di, L., Wei, Y., Liu, Y., Bai, Y., Hu, C., & Mehrotra, P. (2005). Grid enabled Geospatial Catalogue Web Services. American Society for Photogrammetry and Remote Sensing 2005. Baltimore, MD, USA. Mar 7-11.
Di, L., Yang, W., Chen, A., Liu, Y., & Wei, Y. (2005). The Development of a Geospatial Grid by Integrating OGC Web Services with Globus-based Grid Technology. GridWorld™/GGF15. Boston, MA, USA. Oct 3-6.
ESG (2005). Earth System Grid. Retrieved on March 10, 2006, from https://www.earthsystemgrid.org/home/publicHomePage.do
Foster, I., Vockler, M.W., & Zhao, Y. (2003). The Virtual Data Grid: A New Model and Architecture for Data-Intensive Collaboration. Conference on Innovative Data Systems Research (CIDR) 2003. Asilomar, CA, USA. Jan 5-8.
Foster, I., Kesselman, C., Nick, J.M., & Tuecke, S. (2002). The Physiology of the Grid: An open Grid services architecture for distributed systems integration. Tech report, Globus Project. Retrieved on March 6, 2006, from http://www.globus.org/research/papers/ogsa.pdf
Foster, I., Kesselman, C., & Tuecke, S. (2001). The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Intl. J. of High Performance Computing Applications, 15(3), 200-222.
Flahive, A., Rahayu, W., Taniar, D., & Apduhan, B. (2005). A Distributed Ontology Framework in the Semantic Grid Environment. 19th International Conference on Advanced Information Networking and Applications (AINA'05), 2, 193-196 (INA, USW, WAMIS, and IPv6 papers).
Globus Toolkit Team (2006). Globus Toolkit 4.0 Release Manuals. Retrieved on March, 2006, from http://www.globus.org/toolkit/docs/4.0/
GriPhyN (2005). Grid Physics Network. Retrieved on March 25, 2006, from http://www.griphyn.org/projinfo/intro/appl_domains.php
Hey, T., & Trefethen, A. E. (2002). The UK E-Science Core Programme and the Grid. Journal of Future Generation Computer Systems (FGCS), 18(8), 1017-1031.
JVO (2006). Virtual Observatory. Retrieved on August 1, 2006, from http://www.ivoa.net/pub/papers/jvo_ho.pdf
Myrseth, P. (2002). The Nordic Power Market and Its Use of Electronic Commerce. Proc. OECD Workshop on Business-to-Business Electronic Commerce: Status, Economic Impact and Policy Implications, www.nr.no/~pmyrseth/artikler/oecd_ie_wokshop_99
Pouchard, L., Cinquini, L., & Strand, G. (2003). The Earth System Grid Discovery and Semantic Web Technologies. Semantic Web Technologies for Searching and Retrieving Scientific Data, ISWC II. Sanibel Island, FL. Oct 20.
PPDG (2005). Particle Physics Data Grid. Retrieved on March 26, 2006, from http://www.ppdg.net/
Roure, D. (2004). Semantic Grid Vision. Retrieved on March 11, 2006, from http://www.semanticgrid.org/vision.html
US NVO (2006). Grid Computing. Retrieved on August 1, 2006, from http://www.us-vo.org/grid.cfm
Yu, J., Venugopal, S., & Buyya, R. (2005). A Market-Oriented Grid Directory Service for Publication and Discovery of Grid Service Providers and their Services. Journal of Supercomputing, Kluwer Academic Publishers, USA.

Key Terms

Geospatial Grid: The extensions and domain-specific applications of the fundamental Grid computing technology in the geospatial discipline, consisting of serving geospatial data and offering geospatial services.

Geospatial Grid Service: A Grid service used to process geospatial data, with geospatial standards-compliant interfaces.

Grid Computing: An emerging service-oriented computing model that provides the ability to perform higher-throughput and data-intensive computing by securely bringing together geographically and organizationally dispersed computational resources to provide users with advanced ubiquitous distributed sharable computing.

Grid Service: A processing component in the Grid environment with WSRF-compliant Grid standard interfaces. Grid services can invoke each other within the virtual organization.

Open Grid Service Architecture (OGSA): An architecture for a service-oriented Grid computing environment for business and scientific use. It is based on several other Web service technologies, notably WSDL and SOAP, to provide a distributed interaction and computing architecture on heterogeneous systems so that different types of resources can communicate and share information.

Virtual Organization (VO): A group of individuals or institutions that securely share the computing resources of a "Grid" for a common goal.

Web Service Resource Framework (WSRF): A specification that provides a clean set of methods to implement stateful Web services that communicate with resource services that allow data to be stored and retrieved. It replaces the Open Grid Service Infrastructure (OGSI).

Chapter XXVIII
Sharing of Distributed Geospatial Data through Grid Technology

Yaxing Wei, George Mason
University, USA Liping Di George Mason University, USA Guangxuan Liao University of Science and Technology China, China Baohua Zhao University of Science and Technology China, China Aijun Chen George Mason University, USA Yuqi Bai George Mason University, USA Abstr act With the rapid accumulation of geospatial data and the advancement of geoscience, there is a critical requirement for an infrastructure that can integrate large-scale, heterogeneous, and distributed storage systems for the sharing of geospatial data within multiple user communities This article probes into the feasibility to share distributed geospatial data through Grid computing technology by introducing several major issues (including system heterogeneity, uniform mechanism to publish and discover geospatial data, performance, and security) to be faced by geospatial data sharing and how Grid technology can help to solve these issues Some recent research efforts, such as ESG and the Data Grid system in GMU CSISS, have proven that Grid technology provides a large-scale infrastructure which can seamlessly integrate dispersed geospatial data together and provide uniform and efficient ways to access the data Copyright © 2009, IGI Global, distributing in print or electronic forms without written permission of IGI Global is prohibited Sharing of Distributed Geospatial Data through Grid Technology INTRODUCT ION With the advancement of computing and network technologies, geospatial applications have become more and more important in both academic and commercial areas (Lo and Yeung, 2002) Geospatial applications focus on geospatial data, such as remote sensing data and survey data Huge quantities of raw geospatial data are processed with geospatial algorithms to generate high-level data that contain information which can be further used to extract knowledge for people to better understand the earth The extent of usefulness of geospatial data and applications has been proven across many diverse areas and 
disciplines, such as meteorology, global climate change, agriculture, forestry, flood monitoring, fire detection and monitoring, and geology (Lamberti and Beco, 2002). This reach is still expanding, bringing geospatial data and applications closer to people's everyday lives, as shown by the success of Google Earth and the popularity of mobile Global Positioning Systems (GPS).

In recent years, more and more satellites have been launched for diverse purposes. Currently, hundreds of active earth observation satellites are observing the earth and continuously collecting tremendous amounts of data. For example, the Earth Observing System (EOS) project of the National Aeronautics and Space Administration (NASA) is producing global observation data of the land surface, biosphere, solid Earth, atmosphere, and oceans at a rate of more than 3 TB/day and archiving these data in data centers distributed across the United States. This exponential accumulation of geospatial data presents new challenges for the effective and efficient use of the data.

Since the mid-1990s, with the explosive growth of the Internet, the focus of computing has shifted from stand-alone and locally networked environments to wide-scale, distributed, and heterogeneous computing infrastructures (Karimi and Peachavanish, 2005). Those changes enabled and compelled new ways of using geospatial data and applications. With the advancement of geoscience, more and more complex geospatial algorithms involving geospatial data from multiple sources and domains are being designed. In contrast to their past monolithic design and implementation, current computing trends suggest that new geospatial applications will be distributed and used in heterogeneous network environments. The capability to efficiently access and share the tremendous amount of distributed geospatial data is crucial to geospatial applications. Consequently, there is a need for a large-scale infrastructure which can seamlessly integrate dispersed geospatial data together
and provide uniform and efficient ways to access the data. Fortunately, recent advances in computing technologies, especially the emerging Grid technology, can make this a reality.

BACKGROUND

In the past, geospatial applications were mostly designed for a single workstation or supercomputer. The geospatial data they needed to process were limited to a single storage system or locally networked storage systems. The generated high-level geospatial data and information were also difficult to share with other geospatial user communities due to the isolation and heterogeneity of computing platforms and storage systems. Today's complex geospatial problems need applications that can analyze large quantities of geospatial data coming from different sources that were isolated from each other in the past. For example, a statistical wildfire forecast model at 8-km spatial resolution in the conterminous USA used over terabytes of data obtained from different sources, including data derived from measurements of the NASA Earth Observing System satellites and daily weather data provided by the National Oceanic and Atmospheric Administration (NOAA) ...

Date posted: 05/08/2014, 22:22
