Article ID: 506264
Journal: Computers, Environment and Urban Systems
Published Year: 2016
Pages: 13 Pages
File Type: PDF
Abstract

• A large-scale cybersearch engine that enables the widespread discovery of distributed geospatial data and services.
• A meta-search-based seed selection and a breadth-first crawling algorithm ensure rapid location of relevant data resources.
• A pattern-matching-based text analysis supports accurate extraction of multi-type geospatial data.
• The software tool PolarHub is shown to be more effective than other spatial data infrastructure solutions.

The advancement of geospatial interoperability research has fostered the proliferation of geospatial resources that are shared and made publicly available on the Web. However, their increasing availability has made identifying the web signature of voluminous geospatial resources a major challenge. In this paper, we introduce a new cyberinfrastructure platform, PolarHub, that conducts large-scale web crawling to discover distributed geospatial data and service resources efficiently and effectively. PolarHub is built upon a service-oriented architecture (SOA) and adopts a Data Access Object (DAO)-based software design pattern to ensure the extensibility of the software system. The proposed meta-search-based seed selection and pattern-matching-based crawling strategy facilitates rapid resource identification and discovery by constraining the search scope on the Web. In addition, PolarHub uses an advanced asynchronous communication strategy that combines client-pull and server-push to ensure high efficiency of the crawling system. These design features make PolarHub a high-performance, scalable, sustainable, collaborative, and interactive platform for active geospatial data discovery. Because of the widespread adoption of Open Geospatial Consortium (OGC) standards, OGC-compliant web services are the primary search target of PolarHub. Currently, the PolarHub system is up and running and serves various scientific communities that demand geospatial data. We consider PolarHub a significant contribution to the fields of information retrieval and geospatial interoperability.
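
The breadth-first crawling and pattern-matching ideas named in the abstract can be illustrated with a minimal sketch. This is not the PolarHub implementation: the function name `crawl`, the injected `fetch` callable, and both regular expressions are assumptions made for illustration. In particular, the sketch assumes that OGC endpoints can be recognised by a `service=WMS`-style query parameter, and it takes seed URLs as given rather than deriving them from a meta-search.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Assumed heuristic: an OGC endpoint URL carries a "service=<type>" query parameter.
OGC_PATTERN = re.compile(r"[?&]service=(WMS|WFS|WCS|WMTS|WPS|CSW|SOS)\b", re.IGNORECASE)
# Simplified link extraction; a real crawler would use an HTML parser.
HREF_PATTERN = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl from seed URLs.

    `fetch` is a hypothetical callable mapping a URL to its HTML text
    (or None on failure). Returns {endpoint_url: service_type}.
    """
    queue = deque(seeds)      # FIFO queue gives breadth-first order
    visited = set(seeds)      # avoid re-queuing pages already seen
    found = {}
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages += 1
        if html is None:
            continue
        for href in HREF_PATTERN.findall(html):
            link = urljoin(url, href)  # resolve relative links
            match = OGC_PATTERN.search(link)
            if match:
                # Record the service endpoint instead of crawling into it.
                found[link] = match.group(1).upper()
            elif link not in visited:
                visited.add(link)
                queue.append(link)
    return found
```

Passing `fetch` as a parameter keeps the sketch testable without network access; for example, `crawl(["http://seed.example/"], pages.get)` works against an in-memory dictionary `pages` of URL-to-HTML entries.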

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications