Article code: 4964748
Journal code: 1447929
Publication year: 2017
English article: 9 pages, PDF
Full-text version: free download
English title of the ISI article
Feature selection method based on support vector machine and shape analysis for high-throughput medical data
Persian translation of the title
روش انتخاب ویژگی بر اساس ماشین بردار پشتیبانی و تجزیه و تحلیل شکل برای داده های پزشکی با کارایی بالا
Keywords
High-throughput medical data, feature selection, support vector machine, shape analysis
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science Applications
English abstract
Proteomics data analysis based on the mass-spectrometry technique can provide a powerful tool for early diagnosis of tumors and other diseases. It can be used for exploring the features that reflect the difference between samples from high-throughput mass spectrometry data, which are important for the identification of tumor markers. Proteomics mass spectrometry data have the characteristics of too few samples, too many features and noise interference, which pose a great challenge to traditional machine learning methods. Traditional unsupervised dimensionality reduction methods do not utilize the label information effectively, so the subspaces they find may not be the most separable ones of the data. To overcome the shortcomings of traditional methods, in this paper, we present a novel feature selection method based on support vector machine (SVM) and shape analysis. In the process of feature selection, our method considers not only the interaction between features but also the relationship between features and class labels, which improves the classification performance. The experimental results obtained from four groups of proteomics data show that, compared with traditional unsupervised feature extraction methods (i.e., Principal Component Analysis - Procrustes Analysis, PCA-PA), our method not only ensures that fewer features are selected but also ensures a high recognition rate. In addition, compared with the two kinds of multivariate filter methods, i.e., Max-Relevance Min-Redundancy (MRMR) and Fast Correlation-Based Filter (FCBF), our method has a higher recognition rate.
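The abstract's remark that unsupervised reduction ignores label information, and may therefore miss the most separable directions, can be illustrated with a small toy sketch. Everything below is an assumption made for illustration: the six samples are invented, and the Fisher score is used as a simple stand-in for a supervised criterion. It is not the paper's SVM and shape-analysis method.

```python
from statistics import mean, pvariance

# Toy data (hypothetical): 6 samples, 2 features.
# Feature 0 has a large spread but is unrelated to the label;
# feature 1 has a small spread but separates the two classes.
X = [
    [10.0, 0.1],
    [-9.0, 0.2],
    [8.0, 0.0],
    [-10.0, 1.0],
    [9.0, 1.1],
    [-8.0, 0.9],
]
y = [0, 0, 0, 1, 1, 1]


def column(data, j):
    """Return the values of feature j across all samples."""
    return [row[j] for row in data]


def variance_score(data, j):
    """Unsupervised criterion: variance of feature j, labels ignored."""
    return pvariance(column(data, j))


def fisher_score(data, labels, j):
    """Supervised criterion: squared gap between class means
    divided by the sum of the within-class variances."""
    a = [row[j] for row, lab in zip(data, labels) if lab == 0]
    b = [row[j] for row, lab in zip(data, labels) if lab == 1]
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b))


def best_feature(scores):
    """Index of the highest-scoring feature."""
    return max(range(len(scores)), key=scores.__getitem__)


n_features = len(X[0])
by_variance = best_feature([variance_score(X, j) for j in range(n_features)])
by_fisher = best_feature([fisher_score(X, y, j) for j in range(n_features)])

print("picked by variance:", by_variance)
print("picked by Fisher score:", by_fisher)
```

On this toy data the variance criterion prefers the noisy feature, while the label-aware criterion prefers the one that separates the classes. That is the gap the abstract attributes to unsupervised methods.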
Publisher
Database: Elsevier - ScienceDirect
Journal: Computers in Biology and Medicine - Volume 91, 1 December 2017, Pages 103-111