Article ID: 4500309
Journal: Mathematical Biosciences
Published Year: 2011
Pages: 5 Pages
File Type: PDF
Abstract

Identification of protein coding regions is fundamentally a statistical pattern recognition problem. Discriminant analysis, a statistical technique for classifying a set of observations into predefined classes, is well suited to such problems. Outliers are present in virtually every data set in any application domain, and classical discriminant analysis methods, including linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA), do not work well when the data set contains outliers. To overcome this difficulty, a robust statistical method is used in this paper. We choose four different coding characteristics as discriminant variables, and robust discriminant analysis gives satisfactory results.
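
To illustrate why outliers hurt classical discriminant rules, the following is a minimal one-dimensional sketch, not the method of the paper. The abstract does not name the robust estimator or the four discriminant variables, so this example makes its own assumptions: a single hypothetical feature, made-up training values, and the median with a scaled median absolute deviation (MAD) swapped in as a simple robust replacement for the mean and standard deviation in a Gaussian quadratic discriminant score.

```python
import math
import statistics


def classical_fit(xs):
    """Mean and sample standard deviation (both sensitive to outliers)."""
    return statistics.mean(xs), statistics.stdev(xs)


def robust_fit(xs):
    """Median and scaled MAD (assumed robust substitute, resistant to outliers)."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    # 1.4826 makes the MAD consistent with the standard deviation under normality.
    return med, 1.4826 * mad


def qda_score(x, loc, scale, prior):
    """Univariate Gaussian quadratic discriminant score (constants dropped)."""
    return math.log(prior) - math.log(scale) - (x - loc) ** 2 / (2 * scale ** 2)


def classify(x, params, priors):
    """Assign x to the class with the largest discriminant score."""
    return max(params, key=lambda k: qda_score(x, *params[k], priors[k]))


# Hypothetical feature values; the noncoding sample contains one outlier (8.0).
training = {
    "coding": [0.9, 1.0, 1.1, 1.2, 1.0, 0.95, 1.05],
    "noncoding": [0.0, 0.1, -0.1, 0.05, -0.05, 0.1, 8.0],
}
priors = {"coding": 0.5, "noncoding": 0.5}

classical = {k: classical_fit(v) for k, v in training.items()}
robust = {k: robust_fit(v) for k, v in training.items()}

x_new = 1.4  # lies close to the coding sample
print("classical rule:", classify(x_new, classical, priors))
print("robust rule:   ", classify(x_new, robust, priors))
```

On this toy data the single outlier inflates the classical variance estimate of the noncoding class so much that the classical rule assigns 1.4 to "noncoding", while the median/MAD rule assigns it to "coding". A multivariate analysis with four discriminant variables would need a robust location and covariance estimator instead; which one the authors used is not stated here.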

► The problem of identifying protein coding regions is addressed by means of a robust discriminant method.
► The accuracy of the robust discriminant methods is better than that of the codon usage method.
► The robust discriminant rules perform better than the classical discriminant rules.
► The robust quadratic discriminant method is recommended for identifying protein coding regions of rice genes.

Related Topics
Life Sciences > Agricultural and Biological Sciences > Agricultural and Biological Sciences (General)