کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4942209 1437160 2017 9 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Intelligent computational model for classification of sub-Golgi protein using oversampling and fisher feature selection methods
ترجمه فارسی عنوان
مدل محاسباتی هوشمند برای طبقه بندی پروتئین زیر گلژی با استفاده از روش انتخاب فراوانی و انتخاب ماهیگیران
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
چکیده انگلیسی
Golgi is one of the core proteins of a cell, constitutes in both plants and animals, which is involved in protein synthesis. Golgi is responsible for receiving and processing the macromolecules and trafficking of newly processed protein to its intended destination. Dysfunction in Golgi protein is expected to cause many neurodegenerative and inherited diseases that may be cured well if they are detected effectively and timely. Golgi protein is categorized into two parts cis-Golgi and trans-Golgi. The identification of Golgi protein via direct method is very hard due to limited available recognized structures. Therefore, the researchers divert their attention toward the sequences from structures. However, owing to technological advancement, exploration of huge amount of sequences was reported in the databases. So recognition of large amount of unprocessed data using conventional methods is very difficult. Therefore, the concept of intelligence was incorporated with computational model. Intelligence based computational model obtained reasonable results, but the gap of improvement is still under consideration. In this regard, an intelligent automatic recognition model is developed in order to enhance the true classification rate of sub-Golgi proteins. In this approach, discrete and evolutionary feature extraction methods are applied on the benchmark Golgi protein datasets to excerpt salient, propound and variant numerical descriptors. After that, an oversampling technique Syntactic Minority over Sampling Technique is employed to balance the data. Hybrid spaces are also generated with combination of these feature spaces. Further, Fisher feature selection method is utilized to reduce the extra noisy and redundant features from feature vector. Finally, k-nearest neighbor algorithm is used as learning hypothesis. Three distinct cross validation tests are used to examine the stability and efficiency of the proposed model. The predicted outcomes of proposed model are better than the existing models in the literature so far. Finally, it is anticipated that the proposed model will provide the foundation to pharmaceutical industry in drug design and research community to innovate new ideas in the area of computational biology and bioinformatics.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Artificial Intelligence in Medicine - Volume 78, May 2017, Pages 14-22
نویسندگان
, , ,