Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
10225729 | Big Data Research | 2018 | 10 Pages |
Abstract
We run experiments on DNA methylation datasets extracted from The Cancer Genome Atlas, focusing on three tumor types: breast, kidney, and thyroid carcinomas. We perform classifications extracting several methylated sites and their associated genes with accurate performance (accuracy >97%). Results suggest that BIGBIOCL can perform hundreds of classification iterations on hundreds of thousands of features in few hours. Moreover, we compare the performance of our method with other state-of-the-art classifiers and with a wide-spread DNA methylation analysis method based on network analysis. Finally, we are able to efficiently compute multiple alternative classification models and extract - from DNA-methylation large datasets - a set of candidate genes to be further investigated to determine their active role in cancer. BIGBIOCL, results of experiments, and a guide to carry on new experiments are freely available on GitHub at https://github.com/fcproj/BIGBIOCL.
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics
Authors
Fabrizio Celli, Fabio Cumbo, Emanuel Weitschek,