Article code: 4374762
Journal code: 1617200
Publication year: 2016
English article: 22-page PDF
Full-text version: free download
English title of the ISI article
Handling high-dimensional data in air pollution forecasting tasks
Persian translation of the title
مدیریت داده های با ابعاد بزرگ در وظایف پیش بینی آلودگی هوا
Keywords
Related topics
Life Sciences and Biotechnology; Agricultural and Biological Sciences; Ecology, Evolution, Behavior and Systematics
English abstract


• The paper provides a comparative study of various feature extraction methods applied to real-world data.
• We use 16 methods of dimensionality reduction and fractional distances.
• Fractional distances exhibit superior performance.
• Isomap, Landmark Isomap and Factor Analysis can be used to formulate universal mappings.

This paper proposes methods for handling the high-dimensional weather forecast data used to predict concentrations of PM10, PM2.5, SO2, NO, CO and O3. Pollution prediction normally requires historical data samples for a large number of points in time, in particular weather forecast data, actual weather data and pollution data. It also typically involves numerous features related to atmospheric conditions. Consequently, analysing such datasets to generate accurate forecasts becomes a very cumbersome task. The paper examines a variety of unsupervised dimensionality reduction methods aimed at obtaining a compact yet informative set of features. As an alternative, an approach that uses fractional distances for data analysis is considered as well. Both strategies were evaluated on real-world data obtained from the Institute of Meteorology and Water Management in Katowice (Poland), with the extended Air Pollution Forecast Model (e-APFM) as the underlying prediction tool. Employing a fractional distance as the dissimilarity measure gave the best forecasting accuracy. Satisfactory results were also obtained with Isomap, Landmark Isomap and Factor Analysis as dimensionality reduction techniques. These methods can also be used to formulate a universal mapping that is ready to use for data gathered in different geographical areas.
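
The abstract does not define the fractional distance it refers to. A common definition, assumed here, is the Minkowski-type measure d_f(x, y) = (sum_i |x_i - y_i|^f)^(1/f) with an exponent 0 < f < 1. The sketch below illustrates that definition only. The function names, the default f = 0.5 and the nearest-sample lookup are hypothetical illustrations, and how e-APFM actually uses the measure is not stated on this page.

```python
from typing import Sequence


def fractional_distance(x: Sequence[float], y: Sequence[float], f: float = 0.5) -> float:
    """Minkowski-type dissimilarity with a fractional exponent 0 < f < 1.

    Assumed definition: (sum_i |x_i - y_i| ** f) ** (1 / f).
    For f < 1 this is a dissimilarity measure, not a metric
    (the triangle inequality does not hold).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of features")
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    return sum(abs(a - b) ** f for a, b in zip(x, y)) ** (1.0 / f)


def nearest_sample(query: Sequence[float],
                   samples: Sequence[Sequence[float]],
                   f: float = 0.5) -> int:
    """Return the index of the sample least dissimilar to the query.

    Hypothetical helper: it only shows how the measure could rank
    historical samples against a new feature vector.
    """
    return min(range(len(samples)),
               key=lambda i: fractional_distance(query, samples[i], f))
```

With f = 0.5, a sample that differs a little in every feature is penalised more than one that differs strongly in a single feature: for the query [0, 0], the sample [0, 3] ranks as closer than [1, 1], the opposite of the Euclidean ordering.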

Publisher
Database: Elsevier - ScienceDirect
Journal: Ecological Informatics - Volume 34, July 2016, Pages 70–91
Authors