Article code | Journal code | Publication year | English article | Full-text version
2823150 | 1161375 | 2011 | 10-page PDF | Free download
English title of the ISI article
EuPathDomains: The divergent domain database for eukaryotic pathogens
Related topics
Life Sciences and Biotechnology; Agricultural and Biological Sciences; Ecology, Evolution, Behavior and Systematics
Abstract

Eukaryotic pathogens (e.g. Plasmodium, Leishmania, trypanosomes) are a major source of morbidity and mortality worldwide. In Africa, one of the most impacted continents, they cause millions of deaths and constitute an immense economic burden. While the genome sequences of several of these organisms are now available, the biological functions of more than half of their proteins are still unknown. This seriously hinders the identification of the expected new therapeutic targets. In this context, the identification of protein domains is a key step in improving the functional annotation of the proteins.
However, several domains are missed in eukaryotic pathogens because of the high phylogenetic distance of these organisms from the classical eukaryote models. We recently proposed a method, co-occurrence domain detection (CODD), that improves the sensitivity of Pfam domain detection by exploiting the tendency of domains to appear preferentially with a few other favored domains in a protein.
In this paper, we present EuPathDomains (http://www.atgc-montpellier.fr/EuPathDomains/), an extended database of protein domains belonging to ten major eukaryotic human pathogens. EuPathDomains gathers known domains and new domains detected by CODD, along with the associated confidence measurements and the GO annotations that can be deduced from the new domains. This database significantly extends the Pfam domain coverage of all selected genomes by proposing new occurrences of domains as well as new domain families that have never been reported before. For example, with a false discovery rate lower than 20%, EuPathDomains increases the number of detected domains by 13% in the Toxoplasma gondii genome and up to 28% in Cryptosporidium parvum, and the total number of domain families by 10% in Plasmodium falciparum and up to 16% in the C. parvum genome.
The database can be queried by protein names, domain identifiers, Pfam or InterPro identifiers, or organisms, and should become a valuable resource for deciphering the protein functions of eukaryotic pathogens.
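
The co-occurrence principle behind CODD can be illustrated with a toy sketch. The abstract states only the idea (a domain tends to appear alongside a few favored partner domains), so everything below is an assumption made for illustration: the domain identifiers, the `FAVORED_PAIRS` table, and the `certify_candidates` function are hypothetical, and the statistical assessment and confidence measures used by the actual method are not reproduced here.

```python
from typing import Iterable, List, Set, FrozenSet

# Hypothetical table of domain-family pairs assumed to co-occur preferentially.
# The identifiers are invented; a real table would be derived from reference proteins.
FAVORED_PAIRS: Set[FrozenSet[str]] = {
    frozenset({"DomA", "DomB"}),
    frozenset({"DomC", "DomD"}),
}


def certify_candidates(
    known: Iterable[str],
    candidates: Iterable[str],
    favored_pairs: Set[FrozenSet[str]] = FAVORED_PAIRS,
) -> List[str]:
    """Keep each weak candidate domain that forms a favored pair with at
    least one domain already detected in the same protein."""
    known_set = set(known)
    certified = []
    for cand in candidates:
        # A candidate is supported only by a *different* known domain.
        if any(frozenset({cand, k}) in favored_pairs for k in known_set if k != cand):
            certified.append(cand)
    return certified


# One protein with a confidently detected DomA and two weak candidates.
print(certify_candidates(["DomA"], ["DomB", "DomD"]))  # prints ['DomB']
```

In this sketch `DomB` is kept because it pairs with the known `DomA`, while `DomD` is dropped because its only favored partner, `DomC`, is absent from the protein.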

Research highlights
▶ Protein domain co-occurrence can be used to increase the sensitivity of domain detection.
▶ EuPathDomains is an extended database of protein domains for several human pathogens.
▶ EuPathDomains increases the number of domains by about 10% in each genome.
▶ EuPathDomains can be queried by protein names, domain IDs, organisms or taxa.

Publisher
Database: Elsevier - ScienceDirect
Journal: Infection, Genetics and Evolution - Volume 11, Issue 4, June 2011, Pages 698–707