کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
459420 696247 2012 16 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
An optimized approach for storing and accessing small files on cloud storage
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
An optimized approach for storing and accessing small files on cloud storage
چکیده انگلیسی

Hadoop distributed file system (HDFS) is widely adopted to support Internet services. Unfortunately, native HDFS does not perform well for large numbers but small size files, which has attracted significant attention. This paper firstly analyzes and points out the reasons of small file problem of HDFS: (1) large numbers of small files impose heavy burden on NameNode of HDFS; (2) correlations between small files are not considered for data placement; and (3) no optimization mechanism, such as prefetching, is provided to improve I/O performance. Secondly, in the context of HDFS, the clear cut-off point between large and small files is determined through experimentation, which helps determine ‘how small is small’. Thirdly, according to file correlation features, files are classified into three types: structurally-related files, logically-related files, and independent files. Finally, based on the above three steps, an optimized approach is designed to improve the storage and access efficiencies of small files on HDFS. File merging and prefetching scheme is applied for structurally-related small files, while file grouping and prefetching scheme is used for managing logically-related small files. Experimental results demonstrate that the proposed schemes effectively improve the storage and access efficiencies of small files, compared with native HDFS and a Hadoop file archiving facility.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Network and Computer Applications - Volume 35, Issue 6, November 2012, Pages 1847–1862
نویسندگان
, , , , , ,