Article ID: 7557504
Journal: Analytical Biochemistry
Published Year: 2016
Pages: 27
File Type: PDF
Abstract
N6-methyladenosine (m6A) is one of the most common and abundant post-transcriptional RNA modifications found in viruses and most eukaryotes. m6A regulates gene expression and plays an essential role in many vital biological processes. Because m6A is widely distributed across the genome, identifying m6A sites from RNA sequences is important for better understanding its regulatory mechanism. Although progress has been made in m6A site prediction, challenges remain. This article aims to further improve the performance of m6A site prediction by introducing a new heuristic nucleotide physical-chemical property selection (HPCS) algorithm. Given a prescribed feature representation for encoding an RNA sequence into a feature vector, the proposed HPCS algorithm extracts an optimized subset of nucleotide physical-chemical properties. We demonstrate the efficacy of the HPCS algorithm under different feature representations, including pseudo dinucleotide composition (PseDNC), auto-covariance (AC), and cross-covariance (CC). Based on the HPCS algorithm, we implemented an m6A site predictor, called M6A-HPCS, which is freely available at http://csbio.njust.edu.cn/bioinf/M6A-HPCS. Rigorous jackknife tests on benchmark datasets demonstrated that M6A-HPCS achieves higher success rates than existing state-of-the-art sequence-based m6A site predictors.
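To make the auto-covariance (AC) representation named in the abstract more concrete, the following is a minimal sketch of how an RNA sequence could be encoded from a single dinucleotide property. It assumes the commonly used dinucleotide-based definition of auto-covariance; the abstract does not give the exact formula used by M6A-HPCS, so this is an illustration rather than the authors' implementation. The names `TOY_PROPERTY`, `property_profile`, and `auto_covariance` are hypothetical, and the property itself (the number of G or C bases in a dinucleotide) is a stand-in for measured physical-chemical values.

```python
from itertools import product

# Toy stand-in property (assumption): number of G/C bases in each dinucleotide.
# A real encoder would load measured physical-chemical values instead.
TOY_PROPERTY = {
    a + b: float((a in "GC") + (b in "GC")) for a, b in product("ACGU", repeat=2)
}


def property_profile(seq, prop):
    """Map an RNA sequence to the property values of its overlapping dinucleotides."""
    return [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]


def auto_covariance(seq, prop, max_lag):
    """Return one auto-covariance value per lag 1..max_lag for a single property."""
    values = property_profile(seq, prop)
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the number of dinucleotides")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    # For each lag, average the product of centered values that are `lag` positions apart.
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


# Encode a short example sequence into a 3-dimensional feature vector.
features = auto_covariance("GGACUGGACU", TOY_PROPERTY, max_lag=3)
```

Under the same assumed definition, cross-covariance differs only in that the two centered values in each product come from two different properties. With several properties and lags, the per-property vectors are concatenated, which is why selecting a compact subset of properties, as HPCS aims to do, directly controls the feature dimension.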
Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry