Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
392042 | Information Sciences | 2015 | 13 Pages |
• An approach to semantically retrieving scientific workflows with cost constraints.
• A CCG model describing the semantics and cost constraints of scientific workflows.
• A distance metric comparing scientific workflows by datatypes and cost constraints.
• The related cases and evaluation show that our approach is effective and efficient.
• A visual tool SciWFRC for editing, storing and comparing scientific workflows.
Scientific workflows are applied in many scientific areas that involve large numbers of complex data computation tasks, such as life science, astronomy, and earth science. However, most existing approaches to scientific workflow retrieval neglect quality of service (QoS) constraints that users care about, and do not let users express and retrieve scientific workflows with arbitrary constraints based on the workflows' graph structures. In this paper, we propose a novel approach to scientific workflow retrieval with cost constraints. We present a graph representation model, the Cost Constrained Graph (CCG), for representing scientific workflows with cost constraints. We define a distance measure for accurate workflow retrieval, so that the CCGs representing candidate workflows can be ranked by comparing their similarity. We also prove theoretically that this measure satisfies all four properties of a distance. Furthermore, we develop a prototype system for editing workflows, assigning weights, and automatically computing workflow similarity. Finally, experiments demonstrate the usefulness and efficiency of workflow retrieval based on our approach.
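To make the idea of ranking workflows by a distance concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes a hypothetical simplification in which a workflow is a mapping from a datatype edge (source type, target type) to a positive cost, and it uses a plain L1 distance over those costs. The names `toy_distance`, `check_metric_axioms`, and the sample workflows are illustrative assumptions. The check also assumes that the four properties mentioned above are the usual metric axioms: non-negativity, identity of indiscernibles, symmetry, and the triangle inequality.

```python
from itertools import product


def toy_distance(a, b):
    """L1 distance between two edge->cost maps (hypothetical stand-in, not the paper's measure).

    An edge missing from one workflow is treated as having cost 0, so costs
    of edges that are present are assumed to be strictly positive.
    """
    edges = set(a) | set(b)
    return sum(abs(a.get(e, 0.0) - b.get(e, 0.0)) for e in edges)


def check_metric_axioms(items, dist, tol=1e-9):
    """Empirically check the four metric axioms on a finite sample."""
    for x, y in product(items, repeat=2):
        d = dist(x, y)
        if d < -tol:                      # non-negativity
            return False
        if (d <= tol) != (x == y):        # identity of indiscernibles
            return False
        if abs(d - dist(y, x)) > tol:     # symmetry
            return False
    for x, y, z in product(items, repeat=3):
        if dist(x, z) > dist(x, y) + dist(y, z) + tol:  # triangle inequality
            return False
    return True


# Hypothetical workflows: keys are (input datatype, output datatype), values are costs.
w1 = {("FASTA", "Alignment"): 2.0, ("Alignment", "Tree"): 5.0}
w2 = {("FASTA", "Alignment"): 3.0, ("Alignment", "Tree"): 5.0}
w3 = {("FASTA", "Alignment"): 2.0, ("Alignment", "Report"): 1.0}

# Rank candidate workflows by their distance to a query workflow (smallest first).
query = w1
ranked = sorted([w3, w2], key=lambda w: toy_distance(query, w))
```

Here `ranked` places `w2` before `w3`, because `w2` differs from the query only in one edge cost while `w3` replaces an entire edge. A finite check like `check_metric_axioms` can only reveal violations on the sampled workflows; it does not replace the theoretical proof mentioned in the abstract.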