Article code: 533671
Journal code: 870151
Publication year: 2016
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
Toward a generic representation of random variables for machine learning
Persian translation of the title (rendered back into English)
Toward a general view of random variables for machine learning
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract


• Introduce a non-parametric representation of i.i.d. stochastic processes.
• The presented pre-processing boosts performance of algorithms.
• Clusterings of financial time series become more stable.
• Clustering prices allows one to recover idiosyncratic risk.
• Experimental results are available at www.datagrapple.com.

This paper presents a pre-processing step and a distance that improve the performance of machine learning algorithms working on independent and identically distributed stochastic processes. We introduce a novel non-parametric approach to representing random variables that separates dependency from distribution without losing any information. We also propose an associated metric that leverages this representation, together with its statistical estimate. Besides experiments on synthetic datasets, the benefits of our contribution are illustrated through the example of clustering financial time series, such as prices from the credit default swaps market. Results are available on the website http://www.datagrapple.com and an IPython Notebook tutorial is available at http://www.datagrapple.com/Tech for reproducible research.
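
The abstract does not spell out the construction, so the following is only a minimal sketch of one way such a split could look, not the authors' implementation. It assumes that the dependency part is carried by normalized ranks, that the distribution part is carried by the sorted values, and that a distance can blend the two with a weight. The names `split_sample`, `rebuild_sample`, `blended_distance`, the weight `theta`, and the histogram-based Hellinger term are all hypothetical choices made for illustration.

```python
import numpy as np


def split_sample(x):
    """Return (normalized ranks, sorted values) for a 1-D sample.

    Assumption: ranks stand in for the dependency information and the
    sorted values stand in for the distribution information.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(x.size, dtype=int)
    ranks[order] = np.arange(1, x.size + 1)
    return ranks / x.size, x[order]


def rebuild_sample(u, sorted_values):
    """Invert split_sample, showing that the split loses no information."""
    idx = np.rint(np.asarray(u) * sorted_values.size).astype(int) - 1
    return sorted_values[idx]


def blended_distance(x, y, theta=0.5, bins=20):
    """Hypothetical distance mixing a rank term and a distribution term.

    x and y must be paired samples of equal length. theta=1 uses only
    the ranks, theta=0 uses only the histograms of the values.
    """
    u_x, s_x = split_sample(x)
    u_y, s_y = split_sample(y)

    # Dependency term: scaled mean squared difference of normalized ranks.
    d_dep = 3.0 * np.mean((u_x - u_y) ** 2)

    # Distribution term: squared Hellinger distance between histograms
    # built on a shared grid (an assumed estimator, chosen for brevity).
    edges = np.linspace(min(s_x[0], s_y[0]), max(s_x[-1], s_y[-1]), bins + 1)
    p = np.histogram(s_x, bins=edges)[0] / s_x.size
    q = np.histogram(s_y, bins=edges)[0] / s_y.size
    d_dist = 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

    return float(np.sqrt(theta * d_dep + (1.0 - theta) * d_dist))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=250)
    b = 0.8 * a + 0.2 * rng.standard_t(df=3, size=250)
    print(blended_distance(a, b, theta=0.5))
```

Under these assumptions, setting `theta=1` makes the distance blind to any strictly increasing rescaling of a series, while `theta=0` makes it blind to the order of the observations. That is the sense in which the two kinds of information are kept apart in this sketch.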


Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 70, 15 January 2016, Pages 24–31