Free article download: Quantifying the value of user-level data cleaning for big data: A case study using mammal distribution models

Article code	Journal code	Publication year	English article	Full-text version
4374768	1617200	2016	7-page PDF	Free download

Title of the ISI article

Quantifying the value of user-level data cleaning for big data: A case study using mammal distribution models

Keywords

Biodiversity informatics; Data cleaning; SDM performance; MaxEnt; Australian mammals; Big data

Related topics

Life Sciences and Biotechnology; Agricultural and Biological Sciences; Ecology, Evolution, Behavior and Systematics

Abstract

• User-level data cleaning is seldom applied to biodiversity databases.
• We present a new framework to quantify the effect of data cleaning on SDMs.
• Data cleaning resulted in significant improvement in SDMs across all studied scales.
• The largest SDM improvement following data cleaning was for small mammals (1 g–100 g).
• We exemplify the value of case-specific, user-level data cleaning.

The recent availability of species occurrence data from numerous sources, standardized and connected within a single portal, has the potential to answer fundamental ecological questions. These aggregated big biodiversity databases are prone to numerous data errors and biases. The data-user is responsible for identifying these errors and assessing if the data are suitable for a given purpose. Complex technical skills are increasingly required for handling and cleaning biodiversity data, while biodiversity scientists possessing these skills are rare. Here, we estimate the effect of user-level data cleaning on species distribution model (SDM) performance. We implement several simple and easy-to-execute data cleaning procedures, and evaluate the change in SDM performance. Additionally, we examine if a certain group of species is more sensitive to the use of erroneous or unsuitable data. The cleaning procedures used in this research improved SDM performance significantly, across all scales and for all performance measures. The largest improvement in distribution models following data cleaning was for small mammals (1 g–100 g). Data cleaning at the user level is crucial when using aggregated occurrence data, and facilitating its implementation is a key factor in order to advance data-intensive biodiversity studies. Adopting a more comprehensive approach for incorporating data cleaning as part of data analysis, will not only improve the quality of biodiversity data, but will also impose a more appropriate usage of such data.

Publisher

Database: Elsevier - ScienceDirect
Journal: Ecological Informatics - Volume 34, July 2016, Pages 139–145

Authors

Tomer Gueta, Yohay Carmel
