Article code: 414509
Journal code: 680969
Publication year: 2016
English article: 5 pages (PDF)
English title of the ISI article
Analysis of a Network IO Bottleneck in Big Data Environments Based on Docker Containers
Keywords
Containers; Context switch; Docker; Hadoop; Map-Reduce
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
Abstract

We live in a world increasingly driven by data, with more information about individuals, companies and governments available than ever before. Every business is now powered by information technology and generates big data, from which future business intelligence can be extracted. NoSQL [1] and Map-Reduce [2] technologies offer an efficient way to store, organize and process big data using virtualization and Linux Container (a.k.a. container) [3] technologies.

Provisioning containers on top of virtual machines is a better model for high resource utilization. However, as more containers share the same CPU, the context switch latency for each container increases significantly. This increase has a negative impact on network IO throughput and creates a bottleneck in big data environments.

In this paper, we study container networking and the various factors that affect context switch latency. We evaluate Hadoop benchmarks [5] against the number of containers and virtual machines. We observe a bottleneck: Hadoop [4] cluster throughput does not scale linearly with the number of nodes sharing the same CPU. This bottleneck is due to virtual network layers, which add a significant delay to the Round Trip Time (RTT) of data packets. Future work can analyze the practical implications of virtual network layers and seek a solution to improve the performance of container-based big data environments.

Publisher
Database: Elsevier - ScienceDirect
Journal: Big Data Research - Volume 3, April 2016, Pages 24–28