Article code: 416666
Journal code: 681393
Publication year: 2006
English article: 8 pages, PDF
English title of the ISI article
Power analysis of database search using multiple scoring matrices
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract

Protein sequence alignment may be viewed as either a classification problem or a multiple hypothesis testing problem. Whereas the type I error of a method is often studied on randomly generated sequences, its power is best investigated on real protein sequences. The SCOP database and its protein classification are used to investigate both the power and the type I error of sequence alignment as provided by BLAST. The focus is on the multiple testing case, in which more than one scoring matrix is used. It is demonstrated that a multiple testing correction needs to be applied in order to control the number of false positives when more than one scoring matrix is used. It is also shown that a proper search procedure based on multiple scoring matrices detects slightly fewer of the homologous sequences present in the SCOP database than the BLOSUM62 matrix alone, while offering the opportunity to detect a wider variety of homologous types.
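
The multiple testing point can be illustrated with a minimal sketch. The abstract does not say which correction the authors apply, so the Bonferroni-style adjustment below is an assumption chosen for simplicity: the best E-value found across k scoring matrices is multiplied by k before it is compared with the significance threshold. The function names, the matrix labels, and the E-values are hypothetical and are not taken from the paper or from the BLAST API.

```python
# Illustrative sketch only: a Bonferroni-style correction is ASSUMED here;
# the abstract does not specify the correction used in the paper.

def corrected_best_hit(evalues_by_matrix):
    """Return (matrix, corrected E-value) for one query/subject pair.

    evalues_by_matrix maps a scoring-matrix name to the E-value that a
    search with that matrix reported for the same pair (hypothetical input).
    """
    if not evalues_by_matrix:
        raise ValueError("need at least one scoring matrix")
    k = len(evalues_by_matrix)  # number of matrices tried, i.e. number of tests
    best_matrix = min(evalues_by_matrix, key=evalues_by_matrix.get)
    # Taking the minimum over k searches inflates false positives;
    # multiplying by k is the conservative (union bound) adjustment.
    return best_matrix, k * evalues_by_matrix[best_matrix]


def is_significant(evalues_by_matrix, threshold=0.01):
    """Declare a hit only if the corrected E-value is within the threshold."""
    _, corrected = corrected_best_hit(evalues_by_matrix)
    return corrected <= threshold


# Made-up E-values for a single pair of sequences.
hits = {"BLOSUM62": 0.004, "BLOSUM45": 0.02, "PAM250": 0.3}
print(corrected_best_hit(hits))
print(is_significant(hits))
```

In this made-up example the uncorrected best E-value (0.004) would pass a 0.01 threshold, but the corrected value does not. This is the trade-off the abstract describes: controlling false positives across several matrices costs some power relative to a single-matrix search.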

Publisher
Database: Elsevier - ScienceDirect
Journal: Computational Statistics & Data Analysis - Volume 51, Issue 3, 1 December 2006, Pages 1656–1663