Perceptron ensemble of graph-based positive-unlabeled learning for disease gene identification

Article code	Journal code	Publication year	English article	Full-text version
14905	1360	2016	8-page PDF	Free download

Keywords

Biological networks, Ensemble of classifiers, Perceptron

Related topics

Engineering and Basic Sciences > Chemical Engineering > Bioengineering

English abstract

• Disease genes are identified with semi-supervised learning methods known as positive-unlabeled learning.
• In this paper, we present a Perceptron ensemble of graph-based positive-unlabeled learning (PEGPUL) on three types of biological attributes: gene ontologies, protein domains and protein-protein interaction networks.
• A Perceptron ensemble is learned from three weighted classifiers: multilevel support vector machine, k-nearest neighbor and decision tree.
• The main contributions of this paper are: (i) incorporating the statistical properties of gene data by choosing proper metrics, (ii) statistical evaluation of biological features, and (iii) robustness of PEGPUL to noise through a multilevel scheme. To assess PEGPUL, we applied it to 12,950 genes: 949 positive (disease) genes from six classes of diseases and 12,001 unlabeled genes.
• Compared with some popular disease gene identification methods, PEGPUL shows reasonable performance in the experiments.
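
The idea of learning a Perceptron over three weighted classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the votes are invented stand-ins for the outputs of an SVM, a k-nearest-neighbor classifier and a decision tree, and the function names `train_perceptron` and `predict` are hypothetical. In this toy data the true label always agrees with the first classifier, so the Perceptron should learn to trust it most.

```python
# Toy sketch: a perceptron that learns how much to trust three base classifiers.
# The votes are invented stand-ins for SVM, k-NN and decision-tree outputs.

def train_perceptron(votes, labels, epochs=20, lr=1.0):
    """Learn weights w and bias b so that sign(w . v + b) matches the labels."""
    w = [0.0] * len(votes[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for v, y in zip(votes, labels):
            score = sum(wi * vi for wi, vi in zip(w, v)) + b
            if y * score <= 0:  # misclassified or on the boundary: update
                w = [wi + lr * y * vi for wi, vi in zip(w, v)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return w, b

def predict(w, b, v):
    """Return +1 (disease gene) or -1 (non-disease gene) for one vote vector."""
    return 1 if sum(wi * vi for wi, vi in zip(w, v)) + b > 0 else -1

# Each row: (svm_vote, knn_vote, tree_vote); +1 = disease, -1 = non-disease.
votes = [
    (+1, +1, -1),
    (+1, -1, +1),
    (-1, +1, +1),
    (-1, -1, +1),
    (+1, -1, -1),
    (-1, +1, -1),
]
labels = [+1, +1, -1, -1, +1, -1]  # always equal to the first classifier's vote

w, b = train_perceptron(votes, labels)
predictions = [predict(w, b, v) for v in votes]
print("weights:", w, "bias:", b)
print("predictions:", predictions)
```

The learned weights play the role of per-classifier trust. A real ensemble would feed the Perceptron with scores from trained base classifiers on held-out genes rather than hand-written votes.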

Identifying disease genes with computational methods is an important issue in biomedical and bioinformatics research. Based on the observation that diseases with the same or similar phenotypes share biological characteristics, researchers have tried to identify disease genes using machine learning tools. In recent attempts, semi-supervised learning methods known as positive-unlabeled learning have been used for disease gene identification. In this paper, we present a Perceptron ensemble of graph-based positive-unlabeled learning (PEGPUL) on three types of biological attributes: gene ontologies, protein domains and protein-protein interaction networks. In our method, a reliable set of positive and negative genes is extracted using a co-training scheme. Then, the similarity graph of genes is built using metric learning, and inference from the labeled genes is performed with the multi-rank-walk method. Finally, a Perceptron ensemble is learned from three weighted classifiers: multilevel support vector machine, k-nearest neighbor and decision tree. The main contributions of this paper are: (i) incorporating the statistical properties of gene data by choosing proper metrics, (ii) statistical evaluation of biological features, and (iii) robustness of PEGPUL to noise through a multilevel scheme. To assess PEGPUL, we applied it to 12,950 genes: 949 positive (disease) genes from six classes of diseases and 12,001 unlabeled genes. Compared with some popular disease gene identification methods, PEGPUL shows reasonable performance in the experiments.
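
The graph-based inference step can be pictured with a small sketch in the spirit of multi-rank-walk: one random walk with restart is run per class from that class's seed genes, and each gene takes the class whose walk scores it higher. Everything here is an assumption for illustration: the six-gene graph, its edge weights, the seed choice, the damping value and the function name `random_walk_with_restart` are invented, and the paper's metric-learned similarity graph is replaced by hand-written weights.

```python
# Toy sketch of multi-rank-walk-style label inference on a gene similarity graph.
# The graph, edge weights and seed genes are invented for illustration.

def random_walk_with_restart(adj, seeds, damping=0.85, iters=100):
    """Score every node by a random walk that restarts uniformly at `seeds`."""
    nodes = list(adj)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for u in nodes:
            total = sum(adj[u].values())  # every node here has a neighbor
            for v, weight in adj[u].items():
                nxt[v] += damping * rank[u] * weight / total
        rank = nxt
    return rank

# Two tight clusters (g1-g3 and g4-g6) joined by one weak similarity edge.
adj = {
    "g1": {"g2": 1.0, "g3": 1.0},
    "g2": {"g1": 1.0, "g3": 1.0},
    "g3": {"g1": 1.0, "g2": 1.0, "g4": 0.1},
    "g4": {"g3": 0.1, "g5": 1.0, "g6": 1.0},
    "g5": {"g4": 1.0, "g6": 1.0},
    "g6": {"g4": 1.0, "g5": 1.0},
}

pos = random_walk_with_restart(adj, {"g1"})  # g1: known disease gene (seed)
neg = random_walk_with_restart(adj, {"g6"})  # g6: assumed reliable negative

# Each gene takes the class whose walk gives it the higher score.
inferred = {g: ("disease" if pos[g] > neg[g] else "non-disease") for g in adj}
for gene, label in inferred.items():
    print(gene, label)
```

Because the walk restarts at the seeds, scores stay concentrated in the seed's own cluster, so the unlabeled genes g2 and g3 follow g1 while g4 and g5 follow g6.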

Publisher

Database: Elsevier - ScienceDirect
Journal: Computational Biology and Chemistry - Volume 64, October 2016, Pages 263–270

Authors

Gholam-Hossein Jowkar, Eghbal G. Mansoori
