Article ID: 15154
Journal: Computational Biology and Chemistry
Published Year: 2013
Pages: 9
File Type: PDF
Abstract

Protein inference is an important issue in proteomics research. Its main objective is to select a proper subset of candidate proteins that best explains the observed peptides. Although many methods have been proposed for solving this problem, several issues, such as peptide degeneracy and one-hit wonders, remain unsolved. Therefore, the accurate identification of proteins that are truly present in the sample continues to be a challenging task.

Based on the concept of peptide detectability, we formulate the protein inference problem as a constrained Lasso regression problem, which can be solved very efficiently through a coordinate descent procedure. The new inference algorithm, named ProteinLasso, uses an ensemble learning strategy to address the sparsity parameter selection problem in the Lasso model. We test the performance of ProteinLasso on three datasets. The experimental results show that ProteinLasso outperforms state-of-the-art protein inference algorithms in terms of both identification accuracy and running efficiency. In addition, we show that ProteinLasso is stable under different parameter specifications. The source code of our algorithm is available at: http://sourceforge.net/projects/proteinlasso.
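To make the idea of a constrained Lasso solved by coordinate descent more concrete, the following is a minimal sketch, not the ProteinLasso implementation. The abstract does not state the exact objective, so everything here is an assumption for illustration: `D` is a hypothetical peptide-by-protein detectability matrix, `y` holds hypothetical observed peptide scores, the protein scores `x` are constrained to the interval [0, 1], and the ensemble step simply averages the solutions obtained over a grid of sparsity parameters.

```python
import numpy as np


def constrained_lasso_cd(D, y, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for an assumed objective:
    minimise 0.5 * ||y - D x||^2 + lam * sum(x)  subject to 0 <= x <= 1.
    """
    n_proteins = D.shape[1]
    x = np.zeros(n_proteins)
    residual = np.asarray(y, dtype=float).copy()  # equals y - D @ x for x = 0
    col_sq = (D ** 2).sum(axis=0)                 # squared norm of each column

    for _ in range(n_iter):
        max_change = 0.0
        for j in range(n_proteins):
            if col_sq[j] == 0.0:
                continue  # protein with no detectable peptides stays at 0
            # Correlation of column j with the residual that excludes protein j.
            rho = D[:, j] @ residual + col_sq[j] * x[j]
            # One-dimensional minimiser, clipped to the box [0, 1].
            new_xj = min(1.0, max(0.0, (rho - lam) / col_sq[j]))
            delta = new_xj - x[j]
            if delta != 0.0:
                residual -= delta * D[:, j]
                x[j] = new_xj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return x


def ensemble_scores(D, y, lambdas):
    """Assumed ensemble step: average the solutions over several lam values."""
    return np.mean([constrained_lasso_cd(D, y, lam) for lam in lambdas], axis=0)


# Toy data (hypothetical): 3 peptides (rows) and 3 proteins (columns).
# Peptide 2 is shared by proteins 0 and 1; peptide 3 was not observed.
D = np.array([[0.9, 0.0, 0.0],
              [0.7, 0.6, 0.0],
              [0.0, 0.0, 0.8]])
y = np.array([1.0, 1.0, 0.0])
scores = ensemble_scores(D, y, lambdas=[0.01, 0.05, 0.1, 0.5])
print(scores)
```

Because the scores are non-negative, the L1 penalty reduces to a plain sum, so each coordinate update is a closed-form value clipped to [0, 1]. Averaging over several values of `lam` avoids committing to a single sparsity parameter, which is one simple way to read the ensemble strategy described above.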

Highlights
► The accurate identification of proteins that are truly present in the sample continues to be a challenging task.
► We use a constrained Lasso regression model to formulate the protein inference problem.
► The model can be solved very efficiently through a coordinate descent algorithm.
► An ensemble learning strategy is adopted to address the parameter estimation problem.
