Article code: 383836
Journal code: 660834
Publication year: 2010
English article: 9 pages, PDF
Full-text version: free download
English title of the ISI article
Association classification algorithm based on structure sequence in protein secondary structure prediction
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract

Objective: To propose a novel association classification algorithm, SAC (structural association classification), and to develop a compound pyramid model for accurate and precise protein secondary structure prediction.

Method: Based on sliding window theory, the protein sequence was scanned with a window of length 13, in which the target amino acid resided at the center and the remaining positions were treated as secondary amino acid structures. At the head and tail of the sequence, the mirror method was employed to fill the empty positions with the structure at the opposite position relative to the central position. In the mining process, the KDD* model focuses not only on rules with high support and high confidence but also on rules with high confidence and low support, which are called 'knowledge in shortage'. Towards the end of the mining process, sets H, E and C, consisting of the rules whose consequents are α-helix, β-sheet and C-coil, respectively, were created to meet the basic requirements of protein secondary structure prediction. The knowledge base of protein secondary structure was then established from these three newly acquired rule sets. Based on the CMAR (Classification based on Multiple Association Rules) algorithm, a novel multi-classifier was developed to determine the most likely secondary structure for a given window, using the adjacent information in the amino acid sequence window and screening against the three rule sets.

Result: A protein knowledge base of 8049 rules was obtained, with 2642, 1895 and 3512 rules in sets H, E and C, respectively. Experiments show that, theoretically, the accuracy ratio exceeded 85% when the confidence threshold was 70% and 90%. In four experiments, classification with the SAC multi-classifier demonstrated high accuracy and recall ratios of up to 83.06% on RS126 (Chen and Chaudhari, 2007; Guo et al., 2004; Hu et al., 2004; Liu et al., 2004) and 80.49% on CB513 (Guo et al., 2004; Liu et al., 2004; Wang and Liu, 2004), measured by the Q3 criterion.

Conclusion: The structural association classification algorithm with pyramid classification developed in the present study demonstrated significantly high accuracy in protein secondary structure prediction. The results suggest a highly reliable and accurate alternative for contemporary protein structure prediction.
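
The windowing step and the Q3 criterion mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `mirror_windows` and `q3_accuracy`, the `pad` placeholder, and the exact reading of the mirror method (an out-of-range position is filled from the position at the same distance on the other side of the center) are assumptions made for illustration. Q3 is taken in its usual sense: the percentage of residues whose three-state label (H, E or C) is predicted correctly.

```python
def mirror_windows(sequence: str, width: int = 13, pad: str = "-") -> list[str]:
    """Return one window per residue, centered on that residue.

    Assumed mirror rule: a position that falls outside the sequence is
    filled from the position at the same distance on the opposite side
    of the center. `pad` is a hypothetical fallback used only when the
    mirrored position is also outside the sequence (very short input).
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number")
    half = width // 2
    n = len(sequence)
    windows = []
    for center in range(n):
        chars = []
        for offset in range(-half, half + 1):
            pos = center + offset
            if not 0 <= pos < n:
                pos = center - offset  # mirror about the central position
            chars.append(sequence[pos] if 0 <= pos < n else pad)
        windows.append("".join(chars))
    return windows


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose H/E/C state is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Width 5 keeps the example readable; the abstract uses width 13.
    print(mirror_windows("ACDEFGHIKL", width=5)[0])   # DCACD
    print(q3_accuracy("HHHECCCC", "HHHEECCC"))        # 87.5
```

In a pipeline like the one described, each window would be matched against the H, E and C rule sets, and the resulting per-residue predictions would be scored with Q3.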

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 37, Issue 9, September 2010, Pages 6381–6389