کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
1548972 997766 2008 10 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Gene association study with SVM, MLP and cross-validation for the diagnosis of diseases
موضوعات مرتبط
مهندسی و علوم پایه مهندسی مواد مواد الکترونیکی، نوری و مغناطیسی
پیش نمایش صفحه اول مقاله
Gene association study with SVM, MLP and cross-validation for the diagnosis of diseases
چکیده انگلیسی

Gene association study is one of the major challenges of biochip technology both for gene diagnosis where only a gene subset is responsible for some diseases, and for the treatment of the curse of dimensionality which occurs especially in DNA microarray datasets where there are more than thousands of genes and only a few number of experiments (samples). This paper presents a gene selection method by training linear support vector machine (SVM)/nonlinear MLP (multilayer perceptron) classifiers and testing them with cross-validation for finding a gene subset which is optimal/suboptimal for the diagnosis of binary/multiple disease types. Genes are selected with linear SVM classifier for the diagnosis of each binary disease types pair and tested by leave-one-out cross-validation; then, genes in the gene subset initialized by the union of them are deleted one by one by removing the gene which brings the greatest decrease of the generalization power, for samples, on the gene subset after removal, where generalization is measured by training MLPs with leave-one-out and leave-four-out cross-validations. The proposed method was tested with experiments on real DNA microarray MIT data and NCI data. The result shows that it outperforms conventional SNR method in the separability of the data with expression levels on selected genes. For real DNA microarray MIT/NCI data, which is composed of 7129/2308 effective genes with only 72/64 labeled samples belonging to 2/4 disease classes, only 11/6 genes are selected to be diagnostic genes. The selected genes are tested by the classification of samples on these genes with SVM/MLP with leave-one-out/both leave-one-out and leave-four-out cross-validations. The result of no misclassification indicates that the selected genes can be really considered as diagnostic genes for the diagnosis of the corresponding diseases.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Progress in Natural Science - Volume 18, Issue 6, 10 June 2008, Pages 741–750
نویسندگان
, , ,