Article code: 15084
Journal code: 1374
Publication year: 2014
English article: 6-page PDF
Full-text version: free download
English title of the ISI article
Inferring biological basis about psychrophilicity by interpreting the rules generated from the correctly classified input instances by a classifier
Keywords
Biologically interpretable rules; Cold adaptation; Amino acid composition patterns; Rotation forest; PART rule induction method
Related topics
Engineering and Basic Sciences; Chemical Engineering; Bioengineering
English abstract


• Enhanced classification of psychrophilic proteins by rotation forest.
• Rule extraction from correctly classified sequences.
• Validation of generated rules on structural data.
• Biological interpretation of rules.
• Ranking of amino acids according to their discriminative ability.

Organisms that thrive in extremely cold environments are called psychrophiles, and they offer a wealth of knowledge about the sequence adjustments that occurred in proteins during adaptation to low temperatures. In this paper, we propose a new cascading model to investigate the basis of psychrophilicity. In this model, a superior classifier is used to discriminate psychrophilic from mesophilic protein sequences, and the PART rule-generating algorithm is then applied to the input instances correctly classified by that classifier to generate human-interpretable rules. The derived rules were further validated on a structural dataset and finally analyzed to discover the underlying biological basis of psychrophilicity. As input features, we used global patterns of amino acid composition, one of the key properties that allow psychrophilic proteins to remain functional at extremely cold temperatures. The rotation forest classifier outperformed all the other classifiers, with a maximum accuracy of 70.5% and a maximum AUC of 0.78. The effect of sequence length on classification accuracy was also investigated. Analysis of the derived rules revealed, among other findings, that the amino acids A, D, G, F, and S are over-represented and T is under-represented in psychrophilic proteins. These findings augment the existing domain knowledge of psychrophilic sequence features.
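
The first two steps of the cascade described in the abstract (building amino acid composition features, then keeping only the instances a classifier labels correctly before rule extraction) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function names `aa_composition` and `correctly_classified`, and the `predict` callable standing in for a trained classifier such as a rotation forest, are assumptions introduced here. The rule-induction step (PART) is not shown.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper: non-standard symbols (e.g. X, gaps) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def correctly_classified(features, labels, predict):
    """Keep only the (feature, label) pairs that `predict` labels correctly.

    `predict` is an assumed stand-in for any trained classifier's
    single-instance prediction function.
    """
    return [(x, y) for x, y in zip(features, labels) if predict(x) == y]


if __name__ == "__main__":
    # Toy sequence, not real data.
    comp = aa_composition("AAGS")
    print(comp["A"], comp["G"], comp["S"])
```

The instances returned by `correctly_classified` would then be passed to a rule learner, so that the extracted rules describe only the cases the classifier actually handles well.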


Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Biology and Chemistry - Volume 53, Part B, December 2014, Pages 198–203