Analysis of protein sequences and their secondary structures based on transition matrices

Article ID	Journal	Published Year	Pages	File Type
5418094	Journal of Molecular Structure: THEOCHEM	2007	8 Pages	PDF

Abstract

Protein databases are growing rapidly, but it is difficult to obtain information directly from protein sequences. Many methods have therefore been proposed to analyze protein sequences, but the existing methods are limited in how exactly they can characterize a sequence numerically. Here, we regard a protein sequence as a discrete-time Markov chain and construct transition matrices to characterize it numerically. Based on the properties of Markov chains, we predict yesterday's, today's, and tomorrow's distributions of every amino acid, from which we can analyze the similarities of different species in the past or in the future. We also give a simple way to evaluate methods for protein secondary structure prediction.
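
As a rough illustration of the Markov-chain view, the sketch below estimates a first-order transition matrix from a single sequence and pushes the residue distribution one step forward. The abstract does not say how the authors build their matrices, so counting adjacent residue pairs and normalizing each row is an assumption here. The function names (transition_matrix, composition, step) and the toy sequence are hypothetical, and the backward ("yesterday's") distribution is left out because its construction is not given in the abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def transition_matrix(seq):
    """Row-normalized matrix: P[a][b] estimates Pr(next = b | current = a)."""
    counts = Counter(zip(seq, seq[1:]))  # counts of adjacent residue pairs
    matrix = {}
    for a in AMINO_ACIDS:
        row_total = sum(counts[(a, b)] for b in AMINO_ACIDS)
        # Assumption: a residue with no observed successor gets an all-zero row.
        matrix[a] = {b: (counts[(a, b)] / row_total if row_total else 0.0)
                     for b in AMINO_ACIDS}
    return matrix


def composition(seq):
    """Empirical residue frequencies, used as "today's" distribution in this sketch."""
    freq = Counter(seq)
    return {a: freq[a] / len(seq) for a in AMINO_ACIDS}


def step(dist, matrix):
    """One forward step of the chain: next[b] = sum over a of dist[a] * P[a][b]."""
    return {b: sum(dist[a] * matrix[a][b] for a in AMINO_ACIDS)
            for b in AMINO_ACIDS}


seq = "ACACAD"  # toy sequence for illustration, not real data
P = transition_matrix(seq)
today = composition(seq)
tomorrow = step(today, P)
```

In this toy case the final residue D has no successor, so its row is all zeros and the forward distribution sums to less than 1. A real analysis would need some convention for such rows, for example smoothing the counts.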

Keywords

Biological sequences; Protein secondary structure; Similarity; Transition matrix