Article ID: 4500241
Journal: Mathematical Biosciences
Published Year: 2012
Pages: 13
File Type: PDF
Abstract

In current computational biology, assigning a protein domain to a fold class is a complicated and controversial task. It becomes even more challenging when the fold pattern of a protein domain must be identified solely from information extracted from the protein sequence. To deal with this problem, the concepts of hyperfold and interlaced folds are introduced for the first time. Each hyperfold is a set of interlaced folds with a centroid fold. These concepts are used to construct a framework for handling the uncertainty involved in the fold classification problem. In this approach, an unknown query protein is assigned to a hyperfold rather than to a single fold. Ten different sequence-based features are used to predict the correct hyperfold. The architecture is built on the Dempster–Shafer theory of evidence: bodies of evidence are constructed for the hyperfolds, and Dempster’s rule of combination is used to combine them. The resulting classification architecture was applied to identifying protein folds among the 27 well-known SCOP fold patterns from a stringent, widely used benchmark dataset. Compared with existing predictors tested on the same benchmark dataset, our approach may achieve better results.
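
To make the fusion step concrete, the following is a minimal sketch of Dempster’s rule of combination for two bodies of evidence. It is not the authors' implementation: the function name `combine`, the fold labels `F1` to `F3`, and the mass values are hypothetical, and each body of evidence is assumed to be a mapping from a set of candidate folds (a focal element, here standing in for a hyperfold) to a mass, with masses summing to 1.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of fold labels (a focal element)
    to its mass. Products of masses whose focal elements do not intersect
    count as conflict and are removed by renormalization.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}


# Hypothetical evidence from two sequence-based features over three folds.
theta = frozenset({"F1", "F2", "F3"})  # full frame: complete ignorance
evidence_1 = {frozenset({"F1", "F2"}): 0.6, theta: 0.4}
evidence_2 = {frozenset({"F2", "F3"}): 0.7, theta: 0.3}

fused = combine(evidence_1, evidence_2)
for focal, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(focal), round(mass, 3))
```

In this toy case the two sources overlap only on `F2`, so the singleton `{F2}` receives the largest combined mass, while the remaining mass stays on the larger sets and continues to express the uncertainty that assignment to a set of folds is meant to retain.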

Highlights
► Introducing hyperfold concepts to manage the uncertainty in fold classification.
► Using different sequence-based features to construct proper hyperfolds.
► Applying Dempster–Shafer theory to build bodies of evidence for each hyperfold.
► Utilizing Dempster’s rule of combination as a fusion tool to handle nonspecificity.

Related Topics
Life Sciences Agricultural and Biological Sciences Agricultural and Biological Sciences (General)