کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
1993241 1541243 2015 15 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
A novel mixed integer programming for multi-biomarker panel identification by distinguishing malignant from benign colorectal tumors
ترجمه فارسی عنوان
برنامه ریزی عاملی ترکیبی جدید برای شناسایی پانل چندین بیومارکر با تشخیص بدخیم از تومورهای خوش خیم پروستات
کلمات کلیدی
بیوانفورماتیک ترجمهی، بیومارکر چندگانه، سرطان روده بزرگ، برنامه ریزی عدد صحیح مختلط
موضوعات مرتبط
علوم زیستی و بیوفناوری بیوشیمی، ژنتیک و زیست شناسی مولکولی زیست شیمی
چکیده انگلیسی

Multi-biomarker panels can capture the nonlinear synergy among biomarkers and they are important to aid in the early diagnosis and ultimately battle complex diseases. However, identification of these multi-biomarker panels from case and control data is challenging. For example, the exhaustive search method is computationally infeasible when the data dimension is high. Here, we propose a novel method, MILP_k, to identify serum-based multi-biomarker panel to distinguish colorectal cancers (CRC) from benign colorectal tumors. Specifically, the multi-biomarker panel detection problem is modeled by a mixed integer programming to maximize the classification accuracy. Then we measured the serum profiling data for 101 CRC patients and 95 benign patients. The 61 biomarkers were analyzed individually and further their combinations by our method.We discovered 4 biomarkers as the optimal small multi-biomarker panel, including known CRC biomarkers CEA and IL-10 as well as novel biomarkers IMA and NSE. This multi-biomarker panel obtains leave-one-out cross-validation (LOOCV) accuracy to 0.7857 by nearest centroid classifier. An independent test of this panel by support vector machine (SVM) with threefold cross validation gets an AUC 0.8438. This greatly improves the predictive accuracy by 20% over the single best biomarker. Further extension of this 4-biomarker panel to a larger 13-biomarker panel improves the LOOCV to 0.8673 with independent AUC 0.8437. Comparison with the exhaustive search method shows that our method dramatically reduces the searching time by 1000-fold. Experiments on the early cancer stage samples reveal two panel of biomarkers and show promising accuracy.The proposed method allows us to select the subset of biomarkers with best accuracy to distinguish case and control samples given the number of selected biomarkers. Both receiver operating characteristic curve and precision-recall curve show our method’s consistent performance gain in accuracy. Our method also shows its advantage in capturing synergy among selected biomarkers. The multi-biomarker panel far outperforms the simple combination of best single features. Close investigation of the multi-biomarker panel illustrates that our method possesses the ability to remove redundancy and reveals complementary biomarker combinations. In addition, our method is efficient and can select multi-biomarker panel with more than 5 biomarkers, for which the exhaustive methods fail.In conclusion, we propose a promising model to improve the clinical data interpretability and to serve as a useful tool for other complex disease studies. Our small multi-biomarker panel, CEA, IL-10, IMA, and NSE, may provide insights on the disease status of colorectal diseases.The implementation of our method in MATLAB is available via the website: http://doc.aporc.org/wiki/MILP_k.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Methods - Volume 83, 15 July 2015, Pages 3–17
نویسندگان
, , , , , ,