Article code: 4948278
Journal code: 1439610
Publication year: 2016
English article: 27 pages, PDF
Full-text version: free download
English title of the ISI article
mGOF-loc: A novel ensemble learning method for human protein subcellular localization prediction
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract
Predicting the subcellular locations of proteins is a critical step toward understanding their functions. Numerous computational methods have recently been developed for protein subcellular localization prediction, and most of them rely on Gene Ontology (GO) information for feature representation. Although prior research has shown that GO information improves predictive performance, it produces a very high-dimensional feature space, and the dimension keeps growing as the number of terms in the GO database increases. To address this issue, we propose a novel feature representation method that exploits sequence evolutionary information instead of GO information. Using this method, we generate a comprehensive set of 828 features from three aspects: physicochemical properties, the position-specific scoring matrix (PSSM), and the k-skip-n-gram model. By combining a multi-label ensemble classifier with the proposed features, we further develop a novel multi-label learning method, mGOF-loc. Results on an updated large-scale dataset covering 37 subcellular locations show that mGOF-loc outperforms existing methods. A webserver that implements mGOF-loc is freely available at http://server.malab.cn/mGOF-loc/Index.html.
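To make the k-skip-n-gram idea in the abstract more concrete, the following is a minimal sketch for the bigram case (n = 2). It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact feature definition, so the function name `k_skip_bigram_features`, the default `k=1`, the normalization by the total pair count, and the choice to skip non-standard residues are all hypothetical.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_skip_bigram_features(sequence, k=1):
    """Return a 400-dimensional vector of normalized residue-pair frequencies.

    A pair (a, b) is counted whenever b follows a with 0..k residues
    skipped in between. This is a generic skip-gram count and an
    assumption; the paper's exact definition may differ.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for skip in range(k + 1):
        step = skip + 1  # distance between the two residues of a pair
        for i in range(len(sequence) - step):
            pair = sequence[i] + sequence[i + step]
            # Pairs containing non-standard residues (e.g. 'X') are ignored.
            if pair in counts:
                counts[pair] += 1
                total += 1
    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]


features = k_skip_bigram_features("ACDA", k=1)
print(len(features))
```

The vector length is fixed at 400 regardless of sequence length, which is the property that keeps such sequence-derived features compact compared with a GO-term feature space that grows with the database.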
Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 217, 12 December 2016, Pages 73-82