Article ID: 424783
Journal: Future Generation Computer Systems
Published Year: 2016
Pages: 14
File Type: PDF
Abstract

• Studies the effect of ineffective parallelism and system overheads on workflow execution in distributed environments.
• Establishes the need for a task clustering technique that realizes the maximum possible parallelism while minimizing system overheads and resource wastage.
• Proposes an autonomic Workflow-and-Platform Aware (WPA) task clustering technique.
• Evaluates the proposed method against state-of-the-art algorithms on four scientific workflows.

A scientific workflow usually consists of a mix of fine-grained and coarse-grained computational tasks with varied runtime requirements. When executed on widely distributed platforms, fine-grained tasks have been observed to incur scheduling overheads that exceed their execution time. In such situations, task clustering is widely used as a runtime optimization: multiple short-duration tasks are combined into a cluster that is scheduled on a single resource, which reduces the scheduling overhead of the fine-grained tasks. However, grouping tasks curtails the degree of parallelism, so it needs to be done optimally. Although a number of task clustering techniques have been developed to reduce the impact of system overheads, they fail to identify the appropriate number of clusters at each level of the workflow needed to achieve the maximum possible parallelism. This work proposes a level-based autonomic Workflow-and-Platform Aware (WPA) task clustering technique that takes into consideration both the workflow structure and the size of the underlying resource set. It aims to achieve the maximum possible parallelism among the tasks at each level of a workflow while minimizing system overheads and resource wastage. A comparative study against current state-of-the-art task clustering approaches on four well-known scientific workflows shows that the proposed method significantly reduces the overall workflow execution time and, at the same time, consolidates the load onto the minimum possible number of resources.
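
The trade-off described above, between per-job overhead and lost parallelism, can be illustrated with a minimal Python sketch. It is not the WPA algorithm, whose details are not given in this abstract. It assumes a single workflow level, a fixed per-job scheduling overhead, and a simple greedy balancing rule; the names `cluster_level`, `level_makespan`, `num_resources`, and `overhead` are hypothetical and chosen only for illustration.

```python
def cluster_level(runtimes, num_resources):
    """Split one level's tasks into at most num_resources clusters.

    Assumption: greedy longest-task-first balancing, used here only as a
    stand-in for whatever rule a real clustering technique applies.
    """
    k = min(len(runtimes), num_resources)  # never more clusters than tasks
    clusters = [[] for _ in range(k)]
    loads = [0.0] * k
    # Place the longest remaining task on the currently lightest cluster.
    for task, rt in sorted(runtimes.items(), key=lambda kv: kv[1], reverse=True):
        i = loads.index(min(loads))
        clusters[i].append(task)
        loads[i] += rt
    return clusters


def level_makespan(clusters, runtimes, overhead):
    """Completion time of a level when each cluster runs on its own resource.

    Assumption: every cluster pays the scheduling overhead exactly once.
    """
    return max(overhead + sum(runtimes[t] for t in c) for c in clusters)


# Eight 2-second tasks, 4 resources, 10 seconds of overhead per scheduled job.
runtimes = {f"t{i}": 2.0 for i in range(8)}
balanced = cluster_level(runtimes, num_resources=4)
single = cluster_level(runtimes, num_resources=1)
print(level_makespan(balanced, runtimes, overhead=10.0))
print(level_makespan(single, runtimes, overhead=10.0))
```

Under these assumptions, matching the number of clusters to the number of resources pays the overhead once per resource while keeping all resources busy, whereas a single cluster pays the overhead once but serializes every task.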

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics