Article code: 11263882
Journal code: 1645578
Publication year: 2018
English article: 18-page PDF
Full-text version: free download
English title of the ISI article
A deep hybrid model to detect multi-locus interacting SNPs in the presence of noise
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
Abstract
Identifying genetic variants associated with complex diseases is a central focus of genome-wide association studies. These studies largely rely on univariate analysis and ignore interaction effects, although it is widely accepted that the etiology of most complex diseases depends on interactions between genetic variants and/or environmental factors. Several machine learning and data mining methods have been consistently successful in exposing these interaction effects. However, despite many efforts, there has been no major breakthrough, owing to the biological complexities and the statistical and computational challenges facing the field of genetic epidemiology. Deep learning is an emerging machine learning approach that promises to reveal the hidden patterns of big data for accurate predictions. In this study, a deep neural network is combined with a random forest to form a hybrid architecture for reliable detection of multi-locus interactions between single nucleotide polymorphisms (SNPs). The proposed hybrid method is evaluated on various simulated scenarios in the absence of main effects for six epistasis models. The best model, with optimal hyper-parameters found by grid search and random grid search, is chosen to enhance the power of the method by maximising the model's prediction accuracy. The performance metrics of each model are analysed for both training and validation. Further, the performance of the method is evaluated in the presence of noise due to missing data, genotyping errors, genetic heterogeneity and phenocopy, and under their combined effects. The power of the method in detecting two-locus interactions is compared with that of previous methods in the presence and absence of noise. On average, the power of the proposed method is much higher than that of the previous methods for all simulated scenarios. Finally, the findings are confirmed on chronic dialysis patient data obtained from a published study performed at the Kaohsiung Chang Gung Memorial Hospital. It is observed that the interaction between SNP 21 (2) and SNP 28 (2) in the mitochondrial D-loop carries the highest risk for disease manifestation.
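To make "interaction in the absence of main effect" concrete, the following is a minimal Python sketch, not the authors' simulation code. It assumes a hypothetical two-locus penetrance table, a minor allele frequency of 0.5 and illustrative noise rates, none of which come from the paper. It checks that each SNP on its own gives the same disease risk for every genotype, then simulates genotypes and injects two of the noise types named in the abstract (genotyping errors and missing data). Genetic heterogeneity and phenocopy are not modelled here.

```python
import random

# Hypothetical penetrance table: P(case | genotype at SNP A, genotype at SNP B).
# Genotypes are coded 0, 1, 2 (number of minor alleles). Values are illustrative.
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]
# Hardy-Weinberg genotype frequencies for an assumed minor allele frequency of 0.5.
GENOTYPE_FREQ = [0.25, 0.5, 0.25]


def marginal_penetrance(table, freqs):
    """Return P(case | genotype) for each SNP, averaged over the other SNP."""
    by_a = [sum(table[a][b] * freqs[b] for b in range(3)) for a in range(3)]
    by_b = [sum(table[a][b] * freqs[a] for a in range(3)) for b in range(3)]
    return by_a, by_b


def simulate(n, rng, n_other_snps=8):
    """Draw n individuals: SNPs 0 and 1 interact, the remaining SNPs are unrelated."""
    rows, labels = [], []
    for _ in range(n):
        g = rng.choices([0, 1, 2], weights=GENOTYPE_FREQ, k=2 + n_other_snps)
        labels.append(1 if rng.random() < PENETRANCE[g[0]][g[1]] else 0)
        rows.append(g)
    return rows, labels


def add_noise(rows, rng, error_rate=0.05, missing_rate=0.05):
    """Return a copy with random missing calls (None) and genotyping errors."""
    noisy = []
    for row in rows:
        new_row = []
        for g in row:
            r = rng.random()
            if r < missing_rate:
                new_row.append(None)  # missing genotype call
            elif r < missing_rate + error_rate:
                # genotyping error: replace with one of the two other genotypes
                new_row.append(rng.choice([x for x in (0, 1, 2) if x != g]))
            else:
                new_row.append(g)
        noisy.append(new_row)
    return noisy


rng = random.Random(0)
by_a, by_b = marginal_penetrance(PENETRANCE, GENOTYPE_FREQ)
rows, labels = simulate(2000, rng)
noisy_rows = add_noise(rows, rng)
```

Under these assumptions every entry of `by_a` and `by_b` is 0.05, so a univariate test on either SNP has nothing to detect, while the joint genotype of the two SNPs determines risk. Data of this kind is what an interaction detector, such as the hybrid model described above, would be evaluated on.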
Publisher
Database: Elsevier - ScienceDirect
Journal: International Journal of Medical Informatics - Volume 119, November 2018, Pages 134-151