Article ID Journal Published Year Pages File Type
6921597 Computers in Biology and Medicine 2014 9 Pages PDF
Abstract
Since the fast development of genome sequencing has produced large scale data, the current work uses the bioinformatics methods to recognize different gene regions, such as exon, intron and promoter, which play an important role in gene regulations. In this paper, we introduce a new method based on the maximum entropy Markov model (MEMM) to recognize the promoter, which utilizes the biological features of the promoter for the condition. However, it leads to a high false positive rate (FPR). In order to reduce the FPR, we provide another new method based on the maximum entropy hidden Markov model (ME-HMM) without the independence assumption, which could also accommodate the biological features effectively. To demonstrate the precision, the new methods are implemented by R language and the hidden Markov model (HMM) is introduced for comparison. The experimental results show that the new methods may not only overcome the shortcomings of HMM, but also have their own advantages. The results indicate that, MEMM is excellent for identifying the conserved signals, and ME-HMM can demonstrably improve the true positive rate.
Keywords
Related Topics
Physical Sciences and Engineering Computer Science Computer Science Applications
Authors
, , , , , , ,