ABSTRACT
The high dimensionality and non-linearity of near-infrared (NIR) spectral data make outliers difficult to measure. This paper proposes a probability-based outlier detection method that uses a copula function to obtain the distribution probability of the spectral data and identify outliers at each wavelength. A negative logarithmic function is then used to quantify the overall variation of the joint distribution for the outliers. The method not only enlarges the spectral difference between typical samples and outliers but also adapts to multiple types of outliers. Moreover, the jump degree from statistics is introduced to determine the outlier threshold automatically, which avoids both empirical threshold setting and the misjudgment of outliers. To evaluate the effectiveness of the algorithm, it was applied to the recognition of different cases and types of outliers and compared with the commonly used PCA-Mahalanobis distance, spectral residual (SR), and leverage methods. The experimental results show that the probability-based outlier detection method effectively improves outlier identification and calibration performance in NIR analysis.
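The pipeline summarized above (per-wavelength probability, negative-logarithm score, automatic threshold) can be illustrated with a rough sketch. This is not the paper's implementation. It assumes rank-based pseudo-observations as the marginal probabilities, it combines wavelengths as if they were independent (the independence copula, a deliberate simplification of whatever copula the authors fit), and it substitutes the largest gap between sorted scores for the jump-degree criterion. The function names `neg_log_scores` and `largest_gap_threshold` and the synthetic data are hypothetical.

```python
import numpy as np


def neg_log_scores(spectra):
    """Score each spectrum by the mean negative log of its per-wavelength
    two-sided tail probability.

    spectra: array of shape (n_samples, n_wavelengths).
    Assumption: marginals are estimated from ranks and wavelengths are
    treated as independent, so the joint -log probability is a plain sum
    (averaged here so the score does not grow with the number of wavelengths).
    """
    n = spectra.shape[0]
    # Rank of every sample at every wavelength, 1..n (double argsort).
    ranks = spectra.argsort(axis=0).argsort(axis=0) + 1
    # Pseudo-observations strictly inside (0, 1).
    u = ranks / (n + 1.0)
    # Small when a sample sits in either tail at that wavelength.
    tail = 2.0 * np.minimum(u, 1.0 - u)
    return -np.log(tail).mean(axis=1)


def largest_gap_threshold(scores):
    """Stand-in for the jump-degree rule: place the threshold in the
    middle of the largest jump between consecutive sorted scores."""
    s = np.sort(scores)
    k = int(np.argmax(np.diff(s)))
    return 0.5 * (s[k] + s[k + 1])


# Synthetic demo: 50 typical spectra plus 3 spectra with a baseline offset.
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.0, 1.0, 100)
typical = np.sin(2 * np.pi * wavelengths) + 0.01 * rng.standard_normal((50, 100))
shifted = np.sin(2 * np.pi * wavelengths) + 1.0 + 0.01 * rng.standard_normal((3, 100))
spectra = np.vstack([typical, shifted])

scores = neg_log_scores(spectra)
threshold = largest_gap_threshold(scores)
flagged = np.flatnonzero(scores > threshold)
print(flagged)
```

In this toy setup the offset spectra occupy the extreme ranks at every wavelength, so their scores separate from the typical samples by a wide gap. Real NIR data with correlated wavelengths would need a fitted copula rather than the independence assumption used here.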