Article ID: 1240442
Journal: Spectrochimica Acta Part B: Atomic Spectroscopy
Published Year: 2012
Pages: 9
File Type: PDF
Abstract

We investigated five clustering and training set selection methods to improve the accuracy of quantitative chemical analysis of geologic samples by laser-induced breakdown spectroscopy (LIBS) using partial least squares (PLS) regression. The LIBS spectra were previously acquired for 195 rock slabs and 31 pressed powder geostandards under 7 Torr CO2 at a stand-off distance of 7 m at 17 mJ per pulse to simulate the operational conditions of the ChemCam LIBS instrument on the Mars Science Laboratory Curiosity rover. The clustering and training set selection methods, which do not require prior knowledge of the chemical composition of the test-set samples, are based on grouping similar spectra and selecting appropriate training spectra for the multi-response PLS (PLS2) model. These methods were: (1) hierarchical clustering of the full set of training spectra and selection of a subset for use in training; (2) k-means clustering of all spectra and generation of PLS2 models based on the training samples within each cluster; (3) iterative use of PLS2 to predict sample composition and k-means clustering of the predicted compositions to subdivide the groups of spectra; (4) soft independent modeling of class analogy (SIMCA) classification of spectra and generation of PLS2 models based on the training samples within each class; (5) use of the Bayesian information criterion (BIC) to determine an optimal number of clusters and generation of PLS2 models based on the training samples within each cluster. The iterative method and the k-means method using 5 clusters showed the best performance, improving the absolute quadrature root mean squared error (RMSE) by ~3 wt.%. The statistical significance of these improvements was ~85%. Our results show that although clustering methods can modestly improve results, a large and diverse training set is the most reliable way to improve the accuracy of quantitative LIBS. In particular, additional sulfate standards and specifically fabricated analog samples with Mars-like compositions may improve the accuracy of ChemCam measurements on Mars. Refinement of the iterative method, modifications of the basic k-means clustering algorithm, and classification based on specifically selected S, C and Si emission lines may also prove beneficial and merit further study.

► We used several methods for grouping samples prior to running PLS2 calculations.
► Results using 4 or 5 k-means clusters were moderately more accurate.
► Iterative use of PLS2 and clustering was also modestly beneficial.
► Statistical significance of greatest improvements was ~85%.
► A diverse training set is the best way to improve quantitative LIBS accuracy.
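The accuracy figures above are quoted as a quadrature RMSE in wt.%. The sketch below assumes this means the per-element RMSE values are combined as a root sum of squares; the abstract does not spell out the definition, and the function names are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and reference values (wt.%)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def quadrature_rmse(per_element_rmse):
    """Combine per-element RMSE values in quadrature (root sum of squares).

    Assumed definition for this sketch: sqrt(sum_i RMSE_i ** 2).
    """
    return math.sqrt(sum(value ** 2 for value in per_element_rmse))
```

Under this definition a single poorly predicted element dominates the total, since each RMSE is squared before the values are summed.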

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry