Article ID: 425092
Journal: Future Generation Computer Systems
Published Year: 2013
Pages: 13
File Type: PDF
Abstract

We describe the design of Galileo, a high-throughput storage system for data streams generated in observational settings. To cope with the data volumes involved, Galileo's shared-nothing architecture supports incremental addition of nodes while accounting for heterogeneity in their capabilities. To store and retrieve data efficiently, Galileo exploits the geospatial and chronological characteristics of these time-series observational streams. Our benchmarks demonstrate that Galileo supports high-throughput storage and efficient retrieval of specific portions of large datasets across different types of queries.

► Distributed file system designed for storing billions of scientific data files.
► High-throughput storage of geospatial streams.
► Fast retrieval and query support.
► Distributed computation support.
► Visualization capabilities.
