Article code: 392985
Journal code: 665212
Publication year: 2015
English article: 21-page PDF
Full-text version: free download
English title of the ISI article
Asymptotic scheduling for many task computing in Big Data platforms
Persian translation of the title
برنامه ریزی هماهنگ برای بسیاری از محاسبات کار در سیستم های بزرگ داده
Keywords
Asymptotic scheduling, many task computing, cloud computing, Big Data platforms, simulation
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract

Due to the advancement of technology, the datasets processed nowadays in modern computer clusters extend beyond the petabyte scale: the 4 detectors of the Large Hadron Collider at CERN produced several petabytes of data in 2011. Large scale computing solutions are increasingly used for genome sequencing tasks in the Human Genome Project. In the context of Big Data platforms, efficient scheduling algorithms play an essential role. This paper deals with the problem of scheduling a set of jobs across a set of machines and specifically analyzes the behavior of the system at very high loads, which is characteristic of Big Data processing. We show that under certain conditions we can easily discover the best scheduling algorithm, prove its optimality and compute its asymptotic throughput. We present a simulation infrastructure designed especially for building and analyzing different types of scenarios. This allows us to extract scheduling metrics for three different algorithms (the asymptotically optimal one, FCFS and a traditional GA-based algorithm) in order to compare their performance. We focus on the transition period from a low incoming job rate to a very high load and back. Interestingly, all three algorithms perform poorly over the transition periods. Since the Asymptotically Optimal algorithm assumes an infinite number of jobs, it can be used after the transition, when the job buffers are saturated. As the other scheduling algorithms do a better job under reduced load, we combine them into a single hybrid algorithm and empirically determine the best switch point, offering in this way an asymptotic scheduling mechanism for many task computing used in Big Data processing platforms.

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 319, 20 October 2015, Pages 71–91
Authors
, ,