Article ID: 1250226
Journal: Vibrational Spectroscopy
Published Year: 2015
Pages: 6
File Type: PDF
Abstract

Vibrational spectroscopy studies often generate datasets containing multiple spectra that are categorized into distinct groups according to similarity. Principal components analysis (PCA) is one of the most frequently used multivariate analysis methods for data reduction of vibrational spectra and for visualization of potential groupings among subjects. Vibrational spectra usually display unimodal or multimodal distribution patterns of absorbance or transmittance across wavenumbers. PCA requires a linear relationship between the data distributions of the objects under analysis; otherwise, the method is prone to a serious artifact known as the ‘horseshoe effect’. This artifact, well known in other fields of science, manifests as a serious distortion of how objects group along the most important principal components, leading to misinterpretation of the relationships between the samples from which they are derived. In this paper, using a simulated mid-infrared spectral dataset, we investigate for the first time the potential for the PCA horseshoe effect in vibrational spectra and why this artifact occurs. We show that when large regions of contiguous wavenumbers are compared between multiple spectra, there can be a non-linear relationship between the distributions of different spectra. Such non-linearity causes the horseshoe effect, and we demonstrate that the degree of distortion in how spectra map onto the first two components is related to the region size. We further show that reducing the size of the spectral region analyzed by PCA can minimize the horseshoe effect. We conclude that PCA should be used with caution in the analysis and interpretation of vibrational spectra and that the application of more robust methods should be explored.
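
The dependence on region size described above can be illustrated with a minimal sketch. This is not the simulated mid-infrared dataset used in the paper. It is a hypothetical toy example in which each spectrum contains a single Gaussian band whose centre shifts steadily along the wavenumber axis. The function names (`simulate_spectra`, `pca`), the band width, the number of spectra and the window limits are all assumptions chosen for illustration. The share of variance captured by the first component is used only as a crude proxy for how linear the relationship between spectra appears to PCA; it does not measure the horseshoe shape itself.

```python
import numpy as np


def simulate_spectra(n_spectra=30, n_points=400, band_width=15.0):
    """Toy spectra (assumed, not the paper's data): one Gaussian band per
    spectrum, with the band centre moving steadily along the axis."""
    axis = np.arange(n_points, dtype=float)
    centres = np.linspace(50.0, n_points - 50.0, n_spectra)
    return np.exp(-0.5 * ((axis[None, :] - centres[:, None]) / band_width) ** 2)


def pca(X, n_components=2):
    """PCA via SVD of the column-centred data matrix.

    Returns the scores on the first n_components and the fraction of
    total variance explained by each of those components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    ratios = s**2 / np.sum(s**2)
    return U[:, :n_components] * s[:n_components], ratios[:n_components]


spectra = simulate_spectra()

# PCA over the full region: spectra far apart share no signal, so the
# single underlying gradient (band position) is spread over many components.
full_scores, full_ratio = pca(spectra)

# PCA over a narrow window of 10 contiguous points (hypothetical limits).
window_scores, window_ratio = pca(spectra[:, 195:205])

print("PC1 variance share, full region:  ", full_ratio[0])
print("PC1 variance share, narrow window:", window_ratio[0])
```

In this toy setup the first component should capture a much smaller share of the variance over the full region than over the narrow window. Plotting `full_scores[:, 0]` against `full_scores[:, 1]` is one way to inspect whether the spectra trace a curved path rather than a straight line.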

Related Topics
Physical Sciences and Engineering > Chemistry > Analytical Chemistry