Article ID | Journal | Published Year | Pages |
---|---|---|---|
453389 | Computer Standards & Interfaces | 2013 | 8 |
In this paper we propose an efficient and scalable storage model and lookup mechanism for provenance logs. The proposed system exploits the loosely coupled structure of provenance logs by separating metadata from the generating process, which lets it manage large datasets with good scalability. In addition, the system uses a trie-based lookup table to greatly reduce provenance data lookup time. Performance results on thousands of graph logs show that our prototype implementation handles the logs without resource over-utilization, which indicates good scalability.
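To make the trie-based lookup idea concrete, the following is a minimal illustrative sketch, not the paper's implementation. The abstract does not describe the key format or what the table stores, so everything here is an assumption: the class names `TrieNode` and `ProvenanceTrie`, the use of string identifiers such as `wf1/run2` as keys, and the idea that each key maps to the name of the store holding its metadata.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of the trie; each edge is labelled with a single character."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Hypothetical payload: where the metadata for this key is stored.
        self.location: Optional[str] = None


class ProvenanceTrie:
    """Hypothetical lookup table mapping provenance keys to storage locations."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, key: str, location: str) -> None:
        # Walk the key character by character, creating nodes as needed.
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.location = location

    def lookup(self, key: str) -> Optional[str]:
        # Cost is proportional to the key length, not to the number of keys.
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.location

    def keys_with_prefix(self, prefix: str) -> List[str]:
        # Descend to the node for the prefix, then collect every key below it.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found: List[str] = []
        stack = [(node, prefix)]
        while stack:
            current, path = stack.pop()
            if current.location is not None:
                found.append(path)
            for ch, child in current.children.items():
                stack.append((child, path + ch))
        return sorted(found)


if __name__ == "__main__":
    table = ProvenanceTrie()
    table.insert("wf1/run1", "store-a")
    table.insert("wf1/run2", "store-b")
    print(table.lookup("wf1/run2"))
    print(table.keys_with_prefix("wf1/"))
```

A trie suits this kind of table when many keys share long prefixes, since an exact lookup touches one node per key character and a prefix query can return all related entries without scanning the whole table.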
► A distributed provenance storage model is developed.
► The storage model manages large datasets with good scalability.
► The storage model provides long-term persistence for provenance metadata.
► The storage model greatly improves the provenance data lookup time.