Article code | Journal code | Publication year | English article | Full-text version
4970087 | 1450026 | 2017 | 12-page PDF | Free download
English title of the ISI article
A Windowing strategy for Distributed Data Mining optimized through GPUs
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
Abstract
This paper introduces an optimized Windowing-based strategy for inducing decision trees in Distributed Data Mining scenarios. Windowing consists of selecting a sample of the available training examples (the window) to induce a decision tree with a standard algorithm, e.g., J48; finding instances not covered by this tree (counter examples) among the remaining training examples and adding them to the window to induce a new tree; and repeating until a termination criterion is met. In this way, the number of training examples required to induce the tree is reduced considerably while the expected accuracy levels are maintained, but at the cost of time performance. Our proposed enhancements address this cost by searching for counter examples on GPUs and further reducing their number in the window. The resulting strategy is implemented in JaCa-DDM, our agents & artifacts tool for Distributed Data Mining; it keeps the benefits of Windowing while distributing the process, is faster than the traditional centralized approach, and in some cases performs similarly to Bagging and Random Forests. Experiments on data mining tasks are reported, including a case study on pixel-based segmentation for the detection of precancerous cervical lesions in medical images.
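The basic Windowing loop described in the abstract can be sketched as follows. This is a minimal, centralized illustration only: it is not the JaCa-DDM implementation, it does not include the GPU-based counter-example search or the window-reduction enhancements, and the tiny categorical tree learner is a hypothetical stand-in for J48. The function names (`induce_tree`, `predict`, `windowing`) and the parameters `initial_fraction` and `max_rounds` are assumptions made for this example.

```python
import random
from collections import Counter


def induce_tree(examples, attrs):
    """Tiny decision-tree learner for categorical attributes.

    Hypothetical stand-in for J48: splits on the attribute that leaves
    the fewest misclassified examples, with no pruning.
    """
    labels = [y for _, y in examples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority  # leaf: a plain class label

    def errors(a):
        # Misclassifications if each branch of `a` predicted its majority class.
        groups = {}
        for x, y in examples:
            groups.setdefault(x[a], Counter())[y] += 1
        return sum(sum(c.values()) - max(c.values()) for c in groups.values())

    best = min(attrs, key=errors)
    rest = [a for a in attrs if a != best]
    children = {}
    for v in {x[best] for x, _ in examples}:
        subset = [(x, y) for x, y in examples if x[best] == v]
        children[v] = induce_tree(subset, rest)
    return (best, children, majority)  # internal node


def predict(tree, x):
    """Follow branches until a leaf; unseen values fall back to the majority."""
    while isinstance(tree, tuple):
        attr, children, majority = tree
        tree = children.get(x[attr], majority)
    return tree


def windowing(examples, attrs, initial_fraction=0.2, max_rounds=20, seed=0):
    """Grow the window with counter examples until none remain."""
    rng = random.Random(seed)
    pool = list(examples)
    rng.shuffle(pool)
    k = max(1, int(len(pool) * initial_fraction))
    window, remaining = pool[:k], pool[k:]
    tree = induce_tree(window, attrs)
    for _ in range(max_rounds):
        covered, counter = [], []
        for e in remaining:
            (covered if predict(tree, e[0]) == e[1] else counter).append(e)
        if not counter:
            break  # termination criterion: the tree covers every remaining example
        window.extend(counter)
        remaining = covered
        tree = induce_tree(window, attrs)
    return tree, window


# Toy data: the class is x[0] XOR x[1]; x[2] is irrelevant. Each row appears 10 times.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
data = [(x, x[0] ^ x[1]) for x in rows] * 10
tree, window = windowing(data, attrs=[0, 1, 2], initial_fraction=0.1)
```

On this toy data the final window is smaller than the full training set while the induced tree still classifies every example correctly, which is the trade-off the abstract describes: fewer training examples, at the price of repeated induction and repeated searches for counter examples.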
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 93, 1 July 2017, Pages 23-30