Article ID: 10768884
Journal: Biochemical and Biophysical Research Communications
Published Year: 2005
Pages: 5 Pages
File Type: PDF
Abstract
Of the 2502 transmembrane (TM) protein sequences with seven TM segments (7-tms) registered in SWISS-PROT 46.0, 2200 are G-protein-coupled receptors (GPCRs). GPCR candidates in eukaryotic genomes can therefore be detected with a reliability of 87.9% merely by correctly predicting the number of TM segments as seven. The predictive accuracies of the TM topology-prediction methods proposed so far, however, are not as high as expected; even the best method, HMMTOP 2.0, captures only 77.6% of 7-tms sequences. Improving this performance, even by only a few percentage points, is necessary in order to identify as many novel GPCR candidate genes as possible among the increasing number of newly sequenced genomes. In this study, we propose a simple but useful prediction method for detecting as many 7-tms protein sequences as possible as GPCR candidates in eukaryotic genomes. The method employs a two-step prediction procedure: the first step collects 7-tms sequences with the best prediction method (HMMTOP 2.0), and the second step picks up the remaining 7-tms sequences with the second-best method (TMHMM 2.0). This procedure improves the capture rate of 7-tms protein sequences in SWISS-PROT considerably, from 77.6% to 84.5%, and increases the number of GPCR candidate sequences predicted as 7-tms in the human genome (Build 35) from 790 (HMMTOP 2.0 alone) to 903. These 790 and 903 candidate sequences include, respectively, 587 and 636 of the 717 known human GPCRs registered in SWISS-PROT 46.0, demonstrating that the proposed combinatorial method is effective in detecting GPCR candidate genes in eukaryotic genomes.
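The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each predictor's output has already been reduced to a mapping from sequence ID to predicted number of TM segments, and that "remaining" means sequences not already collected in the first step. The function names, the input dictionaries, and the toy data are all hypothetical.

```python
def two_step_7tms(hmmtop_counts, tmhmm_counts):
    """Collect 7-tms candidates in two steps.

    Both arguments are hypothetical dicts mapping a sequence ID to the
    number of TM segments predicted by HMMTOP 2.0 and TMHMM 2.0.
    Running the predictors and parsing their output is out of scope here.
    """
    # Step 1: sequences the first predictor assigns exactly seven TM segments.
    step1 = {sid for sid, n in hmmtop_counts.items() if n == 7}
    # Step 2: among the sequences not collected in step 1 (assumed meaning
    # of "remaining"), those the second predictor assigns seven TM segments.
    step2 = {sid for sid, n in tmhmm_counts.items()
             if n == 7 and sid not in step1}
    return step1, step2


def capture_rate(predicted_ids, true_7tms_ids):
    """Fraction of the known 7-tms sequences found in predicted_ids."""
    return len(predicted_ids & true_7tms_ids) / len(true_7tms_ids)


# Toy data (invented for illustration): four sequences that truly have 7 TM segments.
true_ids = {"A", "B", "C", "D"}
hmmtop = {"A": 7, "B": 6, "C": 7, "D": 8}
tmhmm = {"A": 7, "B": 7, "C": 5, "D": 6}

first, second = two_step_7tms(hmmtop, tmhmm)
rate_first = capture_rate(first, true_ids)              # first predictor alone
rate_combined = capture_rate(first | second, true_ids)  # after the second step

# Reliability figure from the abstract: 2200 GPCRs among 2502 7-tms sequences.
reliability_percent = round(2200 / 2502 * 100, 1)
```

Because the second step only adds sequences, the combined capture rate can never fall below that of the first predictor alone. The sketch does not model the cost of that gain, namely any non-7-tms sequences that the second predictor wrongly reports as 7-tms.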
Related Topics
Life Sciences > Biochemistry, Genetics and Molecular Biology > Biochemistry