This online seminar focuses on the ability of iC Quant™ to generate, select, and employ factors for three measurements: one measurement to predict concentrations and two additional measurements to classify spectra.
Partial Least Squares (PLS) is used to predict chemical concentrations by decomposing a sample spectrum into its projections on a set of sub-spectral factors that are determined by analysis of a training set. The training set is composed of carefully chosen, representative sample spectra. Whereas the spectral region used for the PLS model may contain tens to hundreds of variables, the PLS model first reduces this dimensionality to perhaps five to twenty latent variables (which may be called factors). Of these, perhaps two to ten will be retained for the model. The retained factors may be referred to as principal components. These factors primarily describe the variance in the sample spectra that is associated with changes in the chemical components of interest. Factors that are rejected are assumed to describe spectral variance that is associated with noise. The PLS algorithm therefore not only reduces the dimensionality of the analysis but also rejects noise from it, which in turn increases the accuracy of the measurement.
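The decomposition described above can be illustrated with a minimal sketch. It assumes a single predicted concentration (PLS1) and the textbook NIPALS procedure; the source does not say how iC Quant™ computes its factors, so the function names `pls1_fit` and `pls1_predict`, the synthetic two-band spectra, and the choice of two retained factors are all hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """Fit a single-response PLS model with NIPALS (illustrative only)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # mean-centered working copies
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)                # weight vector for this factor
        t = Xr @ w                            # scores of the training spectra
        tt = t @ t
        p = Xr.T @ t / tt                     # spectral loading
        c = yr @ t / tt                       # concentration loading
        Xr = Xr - np.outer(t, p)              # remove the modeled variance
        yr = yr - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # regression vector
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

# Synthetic training set (an assumption): two overlapping Gaussian bands.
rng = np.random.default_rng(0)
channels = np.arange(50)

def band(center, width):
    return np.exp(-0.5 * ((channels - center) / width) ** 2)

pure = np.vstack([band(20, 5), band(28, 5)])  # analyte, interferent
conc = rng.uniform(0.0, 1.0, size=(20, 2))
X_train = conc @ pure + rng.normal(0.0, 0.001, size=(20, 50))
y_train = conc[:, 0]                          # model the analyte only

model = pls1_fit(X_train, y_train, n_factors=2)
x_new = 0.3 * pure[0] + 0.7 * pure[1]         # true analyte level is 0.3
predicted = pls1_predict(model, x_new[None, :])[0]
print(f"predicted analyte concentration: {predicted:.3f}")
```

Fifty spectral variables are reduced to two factors here because the synthetic data contain only two independent sources of chemical variation; real calibrations usually need cross-validation to choose how many factors to retain.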
A carefully designed PLS model produces accurate predictions of the concentrations of interest for a sample spectrum, so long as the spectrum being analyzed is representative of the training set. If, however, a spectrum is an extreme or false sample (an outlier), the analyst won’t know this explicitly from the PLS result. The Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936 [1]. It is a useful way of determining the similarity of an unknown spectrum to a training set. The Mahalanobis distance uses the retained factors (principal components) to compare the unknown spectrum to the training set. A low value indicates that the spectrum is similar to the average spectrum in the training set, whereas a high value indicates an extreme sample. Because the criterion used for comparison encompasses the spectral changes associated with compositional differences (projection onto the principal components), a sample with a high Mahalanobis distance generally has an extreme concentration of at least one chemical species. This makes it a useful measure for alerting the analyst to conditions such as an incorrect reagent charge.
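As a rough illustration of this comparison, the sketch below computes the Mahalanobis distance of a new sample from the center of the training set in the space of the retained factor scores. The function name `mahalanobis_distance`, the randomly generated two-factor scores, and the two example samples are assumptions made for the example; they are not taken from iC Quant™.

```python
import numpy as np

def mahalanobis_distance(train_scores, new_scores):
    """Distance of each new sample from the training-set center,
    scaled by the covariance of the training scores."""
    center = train_scores.mean(axis=0)
    cov = np.cov(train_scores, rowvar=False)
    diff = np.atleast_2d(new_scores) - center
    # Solve cov @ z = diff instead of forming the inverse explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.sum(diff * solved, axis=1))

# Hypothetical scores on two retained factors for 50 training spectra.
rng = np.random.default_rng(1)
train_scores = rng.normal(0.0, [2.0, 0.5], size=(50, 2))

typical = np.array([0.5, 0.1])    # close to the training average
extreme = np.array([10.0, 0.0])   # far out along the first factor

d_typical = mahalanobis_distance(train_scores, typical)[0]
d_extreme = mahalanobis_distance(train_scores, extreme)[0]
print(f"typical: {d_typical:.2f}, extreme: {d_extreme:.2f}")
```

Because the distance is scaled by the spread of the training scores, a displacement along a factor that varies widely in the training set counts for less than the same displacement along a factor that barely varies.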
The F-test, also referred to as the residual, uses the factors that were calculated from the training set but not retained for the model. These factors are expected to contain primarily noise. Therefore, a sample spectrum with a high residual is taken to imply differences from the training set that are random, i.e., not associated with variations in the modeled chemical or physical properties. The residual alerts the analyst to non-normal samples, perhaps caused by the formation of an unexpected byproduct or the addition of the wrong reagent.
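One common way to express such a residual is as a ratio: the sum of squared spectral residuals of the sample, after removing what the retained factors explain, divided by the average of the same quantity over the training set. The sketch below follows that idea under loud assumptions: it substitutes a plain SVD (principal component) decomposition for the PLS factors, ignores degrees-of-freedom corrections, and uses synthetic spectra. The name `residual_f_ratio` is hypothetical, and the exact statistic used by iC Quant™ is not given in the source.

```python
import numpy as np

def residual_f_ratio(X_train, x_new, n_retained):
    """Ratio of a sample's residual sum of squares to the training mean."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are factors ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    V = Vt[:n_retained].T                     # retained factors only

    def residual_ss(x):
        xc = x - mean
        # What is left after projecting onto the retained factors.
        return np.sum((xc - xc @ V @ V.T) ** 2, axis=-1)

    return residual_ss(x_new) / residual_ss(X_train).mean()

# Synthetic spectra (an assumption): two modeled components plus noise.
rng = np.random.default_rng(2)
channels = np.arange(60)

def band(center, width):
    return np.exp(-0.5 * ((channels - center) / width) ** 2)

pure = np.vstack([band(15, 4), band(28, 4)])
conc = rng.uniform(0.0, 1.0, size=(30, 2))
X_train = conc @ pure + rng.normal(0.0, 0.01, size=(30, 60))

normal = 0.4 * pure[0] + 0.6 * pure[1] + rng.normal(0.0, 0.01, size=60)
# Same composition plus an unmodeled band, standing in for a byproduct.
abnormal = normal + 0.2 * band(45, 3)

f_normal = residual_f_ratio(X_train, normal, n_retained=2)
f_abnormal = residual_f_ratio(X_train, abnormal, n_retained=2)
print(f"normal: {f_normal:.1f}, abnormal: {f_abnormal:.1f}")
```

The extra band cannot be reconstructed from the two retained factors, so it ends up almost entirely in the residual. A sample like this could still have an unremarkable Mahalanobis distance, which is why the two classification measurements complement each other.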
Reference
1. Mahalanobis, P. C. (1936). "On the generalised distance in statistics". Proceedings of the National Institute of Sciences of India 2 (1): 49–55.
Webinar Presenter
Your online presenter, Wes Walker, has over 10 years’ experience working on FTIR applications using Process Analytical Technologies in laboratory, pilot plant, and plant environments. Wes graduated from the University of Wyoming and went on to gain his Ph.D. in Analytical Chemistry.
Related topics: iC Quant™, Quantitative Analysis, Chemometrics, Partial Least Squares, PLS, Mahalanobis Distance, Residual, Calibration, Concentration Prediction, Classification, Spectra, Reaction Monitoring, FTIR, Process Analytical Technology, PAT
