Article ID: 1179387
Journal: Chemometrics and Intelligent Laboratory Systems
Published Year: 2015
Pages: 7 Pages
File Type: PDF
Abstract

• A novel feature, called PseKNC, was proposed to formulate DNA sequences.
• An overall accuracy of 83.72% was achieved in predicting origins of replication.
• A free web-server, iORI-PseKNC, was constructed.

Initiation at replication origins is an extremely important step in DNA replication. The distribution of replication origin regions (ORIs) is the major determinant of the timing of genome replication. Thus, correctly identifying ORIs is crucial to understanding the mechanism of DNA replication. With the avalanche of genome sequences generated in the post-genomic age, it is highly desirable to develop computational methods for rapidly, effectively and automatically identifying ORIs in genomes. In this paper, we developed a predictor called iORI-PseKNC for identifying ORIs in the Saccharomyces cerevisiae genome. In the predictor, based on the concept of global and long-range sequence-order effects in DNA sequences, a feature called “pseudo k-tuple nucleotide composition” (PseKNC) was used to encode the DNA sequences by incorporating six local structural properties of the 16 dinucleotides. An overall success rate of 83.72% was achieved in the jackknife cross-validation test on an objective benchmark dataset. Comparisons demonstrate that the new predictor is superior to other methods. As a user-friendly web-server, iORI-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iORI-PseKNC. We hope that iORI-PseKNC will become a useful tool, or at least a complement to existing methods, for identifying ORIs.
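
To make the encoding step more concrete, the following is a minimal sketch of a PseKNC-style feature vector as the scheme is commonly formulated: normalized k-tuple frequencies followed by λ sequence-order correlation factors computed from dinucleotide properties. It is not the authors' implementation. The abstract does not state the values of k, λ or the weight w, so the defaults below are assumptions, and `placeholder_properties` is a hypothetical helper that fills the 16 × 6 property table with arbitrary numbers rather than the real structural properties used in the paper.

```python
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]  # the 16 dinucleotides


def placeholder_properties(n_props=6):
    """Hypothetical stand-in for the 6 local structural properties.

    The numbers are arbitrary and NOT real physicochemical data; substitute
    a properly normalized property table before any real use.
    """
    return {
        d: [((idx * (v + 3)) % 7 - 3) / 3.0 for v in range(n_props)]
        for idx, d in enumerate(DINUCS)
    }


def pseknc(seq, props, k=2, lam=3, w=0.5):
    """Return a PseKNC-style vector of length 4**k + lam (k, lam, w assumed)."""
    seq = seq.upper()
    if set(seq) - set(BASES):
        raise ValueError("sequence must contain only A, C, G, T")
    n = len(seq)
    if n < max(k, lam + 2):
        raise ValueError("sequence too short for the chosen k and lam")

    # Part 1: occurrence frequencies of all 4**k k-tuples (local composition).
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    n_windows = n - k + 1
    for i in range(n_windows):
        counts[seq[i:i + k]] += 1
    freqs = [counts[m] / n_windows for m in kmers]

    # Part 2: correlation factors theta_1..theta_lam. theta_j averages the
    # mean squared property difference between dinucleotides j steps apart,
    # which is how longer-range sequence-order information enters the vector.
    thetas = []
    for j in range(1, lam + 1):
        n_pairs = n - j - 1
        total = 0.0
        for i in range(n_pairs):
            a = props[seq[i:i + 2]]
            b = props[seq[i + j:i + j + 2]]
            total += sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
        thetas.append(total / n_pairs)

    # Joint normalization so that the whole vector sums to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    vec = pseknc("ACGTACGTAGCTAGCT", placeholder_properties())
    print(len(vec))  # 4**2 + 3 components
```

In a pipeline of the kind the abstract describes, each benchmark sequence would be encoded this way and the resulting vectors passed to a classifier, with the jackknife test leaving out one sequence at a time for evaluation.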
