Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
1180513 | 1491536 | 2015 | 7-page PDF | Free download |
• A nonzero feature-retention strategy is proposed to decrease the dimensionality.
• A correlation-based filtering strategy is devised to improve the efficiency.
• A two-stage similarity measure scheme is designed to reduce the computation burden.
• The accuracy of the proposed method is competitive with that of the existing methods.
• The computation time of the proposed method is far less than that of the existing methods.
Similarity-measure-based spectrum matching is an effective approach to chemical compound identification. As the query library and the reference library both grow larger, most existing spectrum-matching methods face a heavy computational burden. In this paper, an effective and efficient compound-identification approach is proposed based on the frequency features of mass spectra. Considering the sparsity of mass spectra, a nonzero feature-selection strategy is proposed to decrease the feature dimensionality of mass spectra. To further improve efficiency, a correlation-based filtering strategy is presented that selects the most correlated reference spectra to create a reduced reference library. Based on the decreased features and the reduced reference library, frequency-feature-based composite similarity measures are computed to estimate the Chemical Abstracts Service (CAS) registry numbers of the mass spectra in a query library. Owing to the reduction in both the feature dimensionality and the reference library, the computation time of the proposed method is only about 6%–11% of that of the existing methods, while the identification performance remains competitive. Experimental results demonstrate the feasibility and efficiency of the proposed method.
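
The three steps named in the abstract (nonzero feature retention, correlation-based filtering, and a composite similarity on the reduced data) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the frequency transform, the composite weighting, or the size of the reduced library, so the use of DFT magnitudes, the `weight` parameter, the candidate count `k`, and all function names here are assumptions.

```python
import numpy as np

def retain_nonzero(query, library):
    """Keep only the m/z bins where the query spectrum is nonzero (assumed rule)."""
    keep = np.flatnonzero(query)
    return query[keep], library[:, keep]

def top_correlated(query, library, k):
    """Indices of the k reference spectra with the highest Pearson correlation."""
    q = query - query.mean()
    lib = library - library.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(q) * np.linalg.norm(lib, axis=1)
    # Constant spectra have zero variance; treat their correlation as 0.
    corr = np.divide(lib @ q, denom, out=np.zeros(len(lib)), where=denom > 0)
    return np.argsort(corr)[::-1][:k]

def cosine(a, b):
    n = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / n) if n > 0 else 0.0

def composite_similarity(query, ref, weight=0.5):
    """Hypothetical composite: intensity cosine blended with DFT-magnitude cosine."""
    fq, fr = np.abs(np.fft.rfft(query)), np.abs(np.fft.rfft(ref))
    return weight * cosine(query, ref) + (1 - weight) * cosine(fq, fr)

def identify(query, library, k=2):
    """Return the row index of the best-matching reference spectrum."""
    q, lib = retain_nonzero(query, library)          # reduce feature dimensionality
    candidates = top_correlated(q, lib, k)           # reduce the reference library
    scores = [composite_similarity(q, lib[i]) for i in candidates]
    return int(candidates[int(np.argmax(scores))])

# Toy data: rows are reference spectra, columns are m/z bins.
library = np.array([
    [0.0, 10.0, 0.0, 5.0, 0.0, 1.0],
    [3.0, 0.0, 7.0, 0.0, 2.0, 0.0],
    [0.0, 9.0, 0.0, 6.0, 0.0, 0.0],
])
query = library[0].copy()
best = identify(query, library, k=2)
```

In this sketch the expensive composite measure is evaluated only on the `k` candidates that survive the cheap correlation filter, which is the kind of saving the abstract attributes to the reduced reference library. In practice the returned row index would be mapped to the CAS registry number stored with that reference spectrum.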
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 142, 15 March 2015, Pages 117–123