Free article download: Selective and Recurring Re-computation of Big Data Analytics Tasks: Insights from a Genomics Case Study

Article code	Journal code	Publication year	English article	Full-text version
10225734	1701206	2018	19-page PDF	Free download

English title of the ISI article

Selective and Recurring Re-computation of Big Data Analytics Tasks: Insights from a Genomics Case Study

Keywords

Re-computation, knowledge decay, big data analysis, genomics

Related subjects

Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics

Abstract

The value of knowledge assets generated by analytics processes using Data Science techniques tends to decay over time, as a consequence of changes in the elements the process depends on: external data sources, libraries, and system dependencies. For large-scale problems, refreshing those outcomes through greedy re-computation is both expensive and inefficient, as some changes have limited impact. In this paper we address the problem of refreshing past process outcomes selectively, that is, by trying to identify the subset of outcomes that will have been affected by a change, and by only re-executing fragments of the original process. We propose a technical approach to address the selective re-computation problem by combining multiple techniques, and present an extensive experimental study in Genomics, namely variant calling and their clinical interpretation, to show its effectiveness. In this case study, we are able to decrease the number of required re-computations on a cohort of individuals from 495 (blind) down to 71, and we can reduce runtime by at least 60% relative to the naïve blind approach, and in some cases by 90%. Starting from this experience, we then propose a blueprint for a generic re-computation meta-process that makes use of process history metadata to make informed decisions about selective re-computations in reaction to a variety of changes in the data.
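
The core idea of the abstract, re-running only the outcomes that a change can have affected, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Outcome` record, the `affected_outcomes` function, and the sample history are hypothetical names and data invented for illustration, and the sketch assumes that each past run recorded which reference-data entries it used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """A past process outcome plus the reference-data entries it depended on
    (hypothetical stand-in for process history metadata)."""
    patient_id: str
    used_entries: frozenset


def affected_outcomes(history, changed_entries):
    """Return only the outcomes whose recorded dependencies overlap the change."""
    changed = frozenset(changed_entries)
    return [o for o in history if o.used_entries & changed]


# Hypothetical history metadata for three past analyses.
history = [
    Outcome("p1", frozenset({"varA", "varB"})),
    Outcome("p2", frozenset({"varC"})),
    Outcome("p3", frozenset({"varB", "varD"})),
]

# Assume a new release of a reference database changes only entry "varB".
to_rerun = affected_outcomes(history, {"varB"})
print([o.patient_id for o in to_rerun])  # ['p1', 'p3']
```

A blind approach would re-run all three analyses here; the selective filter re-runs two. The filter is only as good as the recorded dependencies, so it presupposes that history metadata was captured when the original process ran.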

Publisher

Database: Elsevier - ScienceDirect
Journal: Big Data Research - Volume 13, September 2018, Pages 76-94

Authors

Jacek Cała, Paolo Missier
