Free article download: Temporal representation for mining scientific data provenance

Article code	Journal code	Publication year	English article	Full-text version
425270	685710	2014	16-page PDF	Free download

English title of the ISI article

Temporal representation for mining scientific data provenance



Keywords

Provenance, temporal representation, data mining

Related topics

Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics


Abstract

• We propose a representation of data provenance using logical time that reduces its feature space.
• This temporal representation supports clustering, classification and association rule mining.
• Analysis of the clustering results shows that the k-means algorithm gives the best performance.
• We carry out an evaluation against a multi-gigabyte synthetic provenance dataset.
• We also carry out an evaluation against a real provenance dataset gathered from a satellite instrument.

Provenance of digital scientific data is a distinct piece of metadata about a data object. It can serve as a “ground truth” for determining the cause of an execution failure, for instance, or can explain a particular result to a researcher intending to reuse a data object. Provenance can quickly grow voluminous and feature-rich, requiring new structures and concepts that support data mining. We propose a representation of data provenance using logical time that reduces the feature space of the provenance. The temporal representation supports clustering, classification, and association rule mining. This paper studies the full utility of the temporal representation through an empirical evaluation and identifies the data mining algorithms that are most effective when applied to the proposed representation. The evaluation is carried out against a multi-gigabyte semi-synthetic provenance dataset built from a range of scientific workflows, and against a real one-month provenance dataset gathered from a satellite instrument. Through analysis of the results using two clustering metrics, purity and Normalized Mutual Information (NMI), we determine that the k-means algorithm gives the best clustering with the proposed temporal representation, while still yielding provenance-useful information.
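The two clustering metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy labels are hypothetical, and the NMI normalization used here (arithmetic mean of the two entropies) is an assumption, since the abstract does not say which variant the paper uses.

```python
import math
from collections import Counter


def purity(true_labels, cluster_ids):
    """Fraction of points that fall in the majority class of their cluster."""
    joint = Counter(zip(cluster_ids, true_labels))
    majority = {}
    for (cluster, _label), count in joint.items():
        majority[cluster] = max(majority.get(cluster, 0), count)
    return sum(majority.values()) / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, cluster_ids):
    """Mutual information divided by the mean of the two entropies.

    Assumption: arithmetic-mean normalization; other variants exist.
    """
    n = len(true_labels)
    class_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_ids)
    joint = Counter(zip(cluster_ids, true_labels))
    mi = 0.0
    for (cluster, label), count in joint.items():
        expected = cluster_counts[cluster] * class_counts[label]
        mi += (count / n) * math.log(count * n / expected)
    denom = (entropy(true_labels) + entropy(cluster_ids)) / 2
    # Both partitions trivial (one class, one cluster): treat as perfect agreement.
    return mi / denom if denom > 0 else 1.0


# Hypothetical toy data: six items from two workflow types, split into two clusters.
workflow_type = ["a", "a", "a", "b", "b", "b"]
cluster = [0, 0, 1, 1, 1, 1]
print(purity(workflow_type, cluster), nmi(workflow_type, cluster))
```

Purity rewards clusters dominated by a single class but is trivially maximized by putting every item in its own cluster. NMI penalizes that kind of over-splitting, which is one reason the two metrics are commonly reported together.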

Publisher

Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 36, July 2014, Pages 363–378

Authors

Peng Chen, Beth Plale, Mehmet S. Aktas
