Article code: 426126
Journal code: 686000
Publication year: 2012
English article: 7 pages, PDF
Full-text version: free download
English title of the ISI article
Towards efficient data search and subsetting of large-scale atmospheric datasets
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract

Discovering the correct dataset efficiently is critical for effective simulations in the atmospheric sciences. Unlike text-based web documents, many large scientific datasets contain binary-encoded data that is hard to discover using popular search engines. Public data hosting services in the atmospheric sciences have grown significantly, but the ability to index and search their data has been limited by the metadata that each data host provides. We have developed an infrastructure, the Atmospheric Data Discovery System (ADDS), that provides an efficient data discovery environment for observational datasets in the atmospheric sciences. To support complex querying capabilities, we automatically extract and index fine-grained metadata. Datasets are indexed based on periodic crawling of popular sites and of files requested by users. Users can access subsets of a large dataset through our data customization feature. Our focus is the overall architecture, the data subsetting scheme, and a performance evaluation of our system.


► Improving search and access to binary datasets published by multiple hosts.
► Community driven scientific data search environment.
► Support for complex queries and rich metadata extraction from binary datasets.
► Efficient subsetting of large atmospheric observational datasets.
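
The two ideas named above, searching over extracted metadata and cutting a subset out of a large dataset, can be illustrated with a minimal sketch. This is not the ADDS implementation, which the abstract does not describe in detail. The record layout (`DatasetMeta`), the function names, and the choice of a latitude/longitude/year query are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetMeta:
    """Hypothetical fine-grained metadata record for one data file."""
    url: str
    variables: frozenset
    lat_range: tuple   # (south, north) in degrees
    lon_range: tuple   # (west, east) in degrees
    year_range: tuple  # (first, last), inclusive


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def search(index, variable, lat_range, lon_range, year_range):
    """Return records that hold `variable` and intersect the query box and years.

    Assumption: longitude ranges do not wrap across the date line.
    """
    return [
        m for m in index
        if variable in m.variables
        and overlaps(m.lat_range, lat_range)
        and overlaps(m.lon_range, lon_range)
        and overlaps(m.year_range, year_range)
    ]


def subset(lats, lons, grid, lat_range, lon_range):
    """Cut a lat/lon window out of a 2-D grid, where grid[i][j] sits at (lats[i], lons[j])."""
    rows = [i for i, lat in enumerate(lats) if lat_range[0] <= lat <= lat_range[1]]
    cols = [j for j, lon in enumerate(lons) if lon_range[0] <= lon <= lon_range[1]]
    return (
        [lats[i] for i in rows],
        [lons[j] for j in cols],
        [[grid[i][j] for j in cols] for i in rows],
    )
```

In this sketch, `search` plays the role of a query against the metadata index, and `subset` returns only the requested window so that a user does not have to download the whole grid. A real system would read coordinates and variables from the binary files themselves rather than from hand-built records.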

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 28, Issue 1, January 2012, Pages 112–118