کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
15372 1407 2008 4 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
A feature vector integration approach for a generalized support vector machine pairwise homology algorithm
موضوعات مرتبط
مهندسی و علوم پایه مهندسی شیمی بیو مهندسی (مهندسی زیستی)
پیش نمایش صفحه اول مقاله
A feature vector integration approach for a generalized support vector machine pairwise homology algorithm
چکیده انگلیسی

Due to the exponential growth of sequenced genomes, the need to quickly provide accurate annotation for existing and new sequences is paramount to facilitate biological research. Current sequence comparison approaches fail to detect homologous relationships when sequence similarity is low. Support vector machine (SVM) algorithms approach this problem by transforming all proteins into a feature space of equal dimension based on protein properties, such as sequence similarity scores against a basis set of proteins or motifs. This multivariate representation of the protein space is then used to build a classifier specific to a pre-defined protein family. However, this approach is not well suited to large-scale annotation. We have developed a SVM approach that formulates remote homology as a single classifier that answers the pairwise comparison problem by integrating the two feature vectors for a pair of sequences into a single vector representation that can be used to build a classifier that separates sequence pairs into homologs and non-homologs. This pairwise SVM approach significantly improves the task of remote homology detection on the benchmark dataset, quantified as the area under the receiver operating characteristic curve; 0.97 versus 0.73 and 0.70 for PSI-BLAST and Basic Local Alignment Search Tool (BLAST), respectively.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computational Biology and Chemistry - Volume 32, Issue 6, December 2008, Pages 458–461
نویسندگان
, , ,