Article code: 2189264
Journal code: 1096204
Publication year: 2006
English article: 29-page PDF
English title of the ISI article
Partitioning Protein Structures into Domains: Why Is it so Difficult?
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Cell Biology
Abstract

This analysis takes an in-depth look into the difficulties encountered by automatic methods for domain decomposition from three-dimensional structure. The analysis involves a multi-faceted set of criteria including the integrity of secondary structure elements, the tendency toward fragmentation of domains, domain boundary consistency and topology. The strength of the analysis comes from the use of a new comprehensive benchmark dataset, which is based on consensus among experts (CATH, SCOP and AUTHORS of the 3D structures) and covers 30 distinct architectures and 211 distinct topologies as defined by CATH. Furthermore, over 66% of the structures are multi-domain proteins; each domain combination occurring once per dataset. The performance of four automatic domain assignment methods, DomainParser, NCBI, PDP and PUU, is carefully analyzed using this broad spectrum of topology combinations and knowledge of rules and assumptions built into each algorithm. We conclude that it is practically impossible for an automatic method to achieve the level of performance of human experts. However, we propose specific improvements to automatic methods as well as broadening the concept of a structural domain. Such work is prerequisite for establishing improved approaches to domain recognition. (The benchmark dataset is available from http://pdomains.sdsc.edu).

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Molecular Biology - Volume 361, Issue 3, 18 August 2006, Pages 562–590