Article ID: 8406674
Journal: Biosystems
Published Year: 2018
Pages: 26
File Type: PDF
Abstract
Autonomous replication sequences (ARS) are essential for replication of the Saccharomyces cerevisiae genome. The content and context of ARS sites are distinct from those of other segments of the genome, and these factors influence the conformation and thermodynamic profile of the DNA in ways that favor binding of the origin recognition complex proteins. Identifying ARS sites in the genome is challenging because of their organizational complexity and the degeneracy present across intergenic regions. We considered a few properties of DNA segments and divided them into multiple subsets (views) for computational prediction of ARS sequences. Our approach used these views to train classification models in an ensemble manner, and predictions were made from the resulting ensemble. This approach improved prediction accuracy over the traditional approach, in which all features are selected at once. Our study also revealed that major groove width and major groove depth are the most prominent properties distinguishing ARS from other segments of the genome. Our investigation also provides clues about the most suitable classifier for a given feature set, and this strategy may be useful for finding ARS in other closely related species.
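The multi-view idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier, the majority vote, the grouping of columns into views, and every numeric value below are assumptions chosen only to keep the example small and self-contained. The abstract also suggests that different classifiers may suit different feature sets, whereas this sketch uses the same one for every view.

```python
from collections import Counter
from statistics import mean

def select(row, columns):
    """Project one sample onto the feature columns of a single view."""
    return [row[i] for i in columns]

def fit_centroids(rows, labels):
    """Nearest-centroid 'training': average the rows of each class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def predict_centroid(centroids, row):
    """Return the label whose centroid is closest in squared distance."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(sorted(centroids), key=sq_dist)

def fit_multiview(X, y, views):
    """Train one model per view (hypothetical helper, not from the paper)."""
    return {name: fit_centroids([select(r, cols) for r in X], y)
            for name, cols in views.items()}

def predict_multiview(models, views, row):
    """Combine the per-view predictions by simple majority vote."""
    votes = [predict_centroid(models[name], select(row, cols))
             for name, cols in views.items()]
    return Counter(votes).most_common(1)[0][0]

# Made-up toy data: 1 = ARS, 0 = non-ARS. Column meanings are assumed:
# 0-1 groove geometry, 2-3 sequence composition, 4 a thermodynamic score.
X = [
    [12.0, 8.5, 0.70, 0.10, -1.2],
    [11.8, 8.7, 0.72, 0.12, -1.1],
    [10.5, 7.0, 0.55, 0.30, -1.9],
    [10.7, 7.2, 0.58, 0.28, -2.0],
]
y = [1, 1, 0, 0]
views = {"groove": [0, 1], "composition": [2, 3], "thermo": [4]}

models = fit_multiview(X, y, views)
prediction = predict_multiview(models, views, [11.9, 8.6, 0.71, 0.11, -1.15])
```

Using an odd number of views with two classes avoids tied votes in this sketch. The single-view baseline that the abstract compares against would correspond to passing one view that contains all columns.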