Article ID: 2833745
Journal: Molecular Phylogenetics and Evolution
Published Year: 2016
Pages: 10
File Type: PDF
Abstract

• A new alignment-free, proteome-based method for phylogenetic tree construction is proposed.
• We convert whole proteome sequences into transition matrices of a higher-order Markov model.
• One-dimensional CGR and a linked list are used to reduce memory requirements.
• A distance measure based on the angle between two feature vectors is used to estimate the phylogenetic distance.
• Our results on two data sets demonstrate that the new method is useful and efficient.
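
The angle-based distance mentioned in the highlights can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `angle_distance` is hypothetical, the feature vectors are assumed to be sparse and stored as Python dictionaries (standing in for the linked-list storage), and the raw angle in radians is used as the distance, which may differ from the normalization used in the paper.

```python
import math


def angle_distance(u, v):
    """Angle in radians between two sparse vectors stored as dicts.

    Keys identify vector components (for example, a context/symbol pair);
    missing keys are treated as zero. Hypothetical helper for illustration.
    """
    # Only components present in both vectors contribute to the dot product.
    dot = sum(val * v[key] for key, val in u.items() if key in v)
    norm_u = math.sqrt(sum(val * val for val in u.values()))
    norm_v = math.sqrt(sum(val * val for val in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)
```

Computing this value for every pair of species would give a symmetric distance matrix of the kind that tree-building software takes as input.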

Traditional methods for sequence comparison and phylogeny reconstruction rely on pairwise and multiple sequence alignments. However, alignment cannot be applied directly to whole genome/proteome comparison and phylogenomic studies because of its high computational complexity. Hence alignment-free methods have become popular in recent years. Here we propose a fast alignment-free method for whole genome/proteome comparison and phylogeny reconstruction using higher-order Markov models and chaos game representation. In the present method, we use the transition matrices of higher-order Markov models to characterize amino acid or DNA sequences for comparison. The order of the Markov model is uniquely identified by maximizing the average Shannon entropy of the conditional probability distributions. By using a one-dimensional chaos game representation and a linked list, the method reduces the large memory and time consumption caused by the large-scale conditional probability distributions. To illustrate the effectiveness of our method, we apply it to fast phylogeny reconstruction based on the genome/proteome sequences of two species data sets used in previously published papers. Our results demonstrate that the present method is useful and efficient.

Availability and implementation: The source code for our algorithm to compute the distance matrix, together with the genome/proteome sequences, can be downloaded from ftp://121.199.20.25/. The Phylip and EvolView software used to construct the phylogenetic trees is available from the respective websites.
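
The transition-matrix and entropy steps described in the abstract can be illustrated with a short sketch. It is written under stated assumptions and is not the authors' code: the names `transition_probs`, `average_entropy` and `best_order` are hypothetical, the conditional distributions are kept in nested dictionaries rather than in the one-dimensional CGR and linked-list structure, and the average is taken as an unweighted mean over the contexts actually observed, which may differ from the definition used in the paper.

```python
import math
from collections import defaultdict


def transition_probs(seq, order):
    """Estimate P(next symbol | preceding `order` symbols) from one sequence."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(seq) - order):
        context = seq[i:i + order]
        counts[context][seq[i + order]] += 1
    probs = {}
    for context, nxt in counts.items():
        total = sum(nxt.values())
        probs[context] = {sym: c / total for sym, c in nxt.items()}
    return probs


def average_entropy(probs):
    """Unweighted mean Shannon entropy (bits) over observed contexts (assumption)."""
    if not probs:
        return 0.0
    entropies = [
        -sum(p * math.log2(p) for p in dist.values()) for dist in probs.values()
    ]
    return sum(entropies) / len(entropies)


def best_order(seq, candidate_orders):
    """Pick the candidate order with the largest average entropy (hypothetical)."""
    return max(
        candidate_orders,
        key=lambda k: average_entropy(transition_probs(seq, k)),
    )
```

For example, `best_order(protein_sequence, range(1, 6))` would compare orders 1 to 5 for a sequence held in a string `protein_sequence`. Storing only the observed contexts keeps the table sparse, which is the same concern that motivates the compact storage described in the abstract.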


Related Topics
Life Sciences Agricultural and Biological Sciences Ecology, Evolution, Behavior and Systematics