Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
431951 | Journal of Parallel and Distributed Computing | 2011 | 12 Pages |
Supporting high-performance data-intensive computing pipelines in wide-area networks is crucial for large-scale distributed scientific applications, which need either minimum end-to-end delay (single-input applications) or maximum frame rate (streaming applications). We formulate the data-intensive computing pipeline mapping problems and categorize them into six classes, defined by two optimization objectives (minimum end-to-end delay and maximum frame rate) and three network constraints (no node reuse, contiguous node reuse, and arbitrary node reuse). We design an optimal dynamic programming-based solution to the problem of minimum end-to-end delay with arbitrary node reuse and prove the NP-completeness of the remaining five problems, for each of which we propose a heuristic algorithm based on a similar optimization procedure. These heuristics are implemented and tested on a large set of simulated pipelines and networks of various scales, and extensive simulation results show that they outperform existing methods.
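To make the dynamic programming idea for minimum end-to-end delay with arbitrary node reuse more concrete, the following is a minimal sketch under an assumed cost model, not the paper's actual algorithm. The abstract does not specify the model, so everything here is hypothetical: the function name `min_end_to_end_delay`, the assumption that a module's computing time is its workload divided by the node's processing power, the assumption that a transfer costs data size divided by bandwidth plus a fixed link latency, and the assumption that the first and last modules are pinned to given source and destination nodes.

```python
import math


def min_end_to_end_delay(work, out_size, power, links, src, dst):
    """Illustrative DP for mapping a linear pipeline onto a network.

    Assumed (hypothetical) inputs:
      work[j]      workload of module j
      out_size[j]  data sent from module j to module j + 1
      power[v]     processing power of node v
      links[(u, v)] = (bandwidth, latency) of the directed link u -> v
    Module 0 is pinned to src and the last module to dst.
    """
    # Group incoming links by destination so each node scans only its own.
    incoming = {v: [] for v in power}
    for (u, v), (bandwidth, latency) in links.items():
        incoming[v].append((u, bandwidth, latency))

    # best[v]: minimum delay to finish modules 0..j with module j on node v.
    best = {v: math.inf for v in power}
    best[src] = work[0] / power[src]

    for j in range(1, len(work)):
        nxt = {}
        for v in power:
            # Option 1: module j reuses the node of module j - 1 (no transfer).
            ready = best[v]
            # Option 2: module j - 1 ran on a neighbor u and ships its output.
            for u, bandwidth, latency in incoming[v]:
                ready = min(ready, best[u] + out_size[j - 1] / bandwidth + latency)
            nxt[v] = ready + work[j] / power[v]
        best = nxt

    # math.inf means the pipeline is too short to reach dst from src.
    return best[dst]


# Toy instance: three modules, three nodes, all values made up.
power = {"A": 1.0, "B": 4.0, "C": 2.0}
links = {("A", "B"): (2.0, 0.5), ("B", "C"): (1.0, 0.5), ("A", "C"): (1.0, 1.0)}
delay = min_end_to_end_delay([2.0, 4.0, 2.0], [4.0, 2.0], power, links, "A", "C")
# delay is 9.0 here: modules are placed on A, B, C in that order.
```

Because a module may stay on the node of its predecessor, each table entry depends only on the previous module's entries, which is what keeps this variant tractable. Constraints that forbid or restrict node reuse require remembering which nodes have already been used, and that history is not captured by a table of this shape.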