Article code: 6874344
Journal code: 1441159
Publication year: 2018
English article: 23 pages, PDF
Full-text version: free download
English title of the ISI article
Big data driven outlier detection for soybean straw near infrared spectroscopy
Persian translation of the title
شناسایی بیگانه داده های بزرگ برای کاه سویا در نزدیکی طیف سنجی مادون قرمز
Related topics
Engineering and basic sciences, Computer engineering, Computational theory and mathematics
English abstract
In near infrared spectroscopy (NIRS) analysis, the prediction ability of the model is seriously affected by outliers that may be the result of errors related to the spectral measurements, the chemical analysis, or a combination of both. In this paper, an outlier detection method is described based on the NIRS analysis data of soybean straw. We improved the resampling by half-mean (RHM) method by including a confidence interval (IRHM) and combined the IRHM and Cook's distance methods (IRHM-COOK) to detect outlier samples in the NIRS data. The confidence interval is an important parameter in the IRHM-COOK method and the optimal confidence intervals for the IRHM and Cook's distance methods are combined and used as the confidence interval for the IRHM-COOK method. The selection process for the confidence interval is aimed at relative independence between the detection of the spectrum outliers and the chemical outliers. The experimental results show that the IRHM-COOK method is superior to the traditional Mahalanobis distance method, the IRHM method, and the Cook's distance method using a partial least squares regression (PLS) model. The determination coefficient (R2) of a hemicellulose PLS calibration model increased from 0.4397918 to 0.5333039 and the root mean square error (RMSE) decreased from 0.7926415 to 0.7287254. The PLS models for lignin and cellulose performed better using the IRHM-COOK method than the original model. The results show that the IRHM-COOK method can effectively identify spectrum outliers and chemical outliers for soybean straw biomass. In addition, it is an effective method to handle NIRS analysis data with one type of outlier, which is proven based on an NIRS analysis of starch.
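The abstract names Cook's distance as the component that targets chemical (reference value) outliers. As a rough illustration only, the sketch below computes Cook's distance for an ordinary least squares fit with NumPy. It rests on several assumptions: the paper works with a PLS model, whereas this uses plain least squares; the function names `cooks_distance` and `flag_outliers` are hypothetical; and the `4 / n` cutoff is a common rule of thumb, not the confidence-interval selection described in the abstract.

```python
import numpy as np


def cooks_distance(X, y):
    """Cook's distance of every sample under an ordinary least squares fit.

    Assumption: OLS stands in for the PLS model used in the paper.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat a 1-D input as one feature
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])    # design matrix with intercept
    p = A.shape[1]                          # number of fitted parameters
    H = A @ np.linalg.pinv(A)               # hat (projection) matrix
    h = np.diag(H)                          # leverage of each sample
    resid = y - H @ y                       # residuals of the fit
    mse = resid @ resid / (n - p)           # residual variance estimate
    return resid**2 / (p * mse) * h / (1.0 - h) ** 2


def flag_outliers(X, y, threshold=None):
    """Return indices whose Cook's distance exceeds the threshold.

    Assumption: the default 4 / n cutoff is a rule of thumb only.
    """
    d = cooks_distance(X, y)
    if threshold is None:
        threshold = 4.0 / len(d)
    return np.flatnonzero(d > threshold), d


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 30)
    y = 2.0 * x + 1.0
    y[10] += 3.0                            # inject one reference-value error
    idx, d = flag_outliers(x, y)
    print("flagged samples:", idx)
```

In this toy setup the sample with the injected reference-value error receives the largest distance. Detecting spectral outliers, which the abstract assigns to the IRHM step, would need a separate procedure that is not sketched here.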
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Computational Science - Volume 26, May 2018, Pages 178-189