1.
Sci Justice ; 64(1): 9-18, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38182317

ABSTRACT

In recent years, numerous studies have examined the chemical compounds of petrol and petrol data for forensic research. Standard quantitative methods often assume that the variables or compounds do not have compositional constraints or are not part of a constrained whole, operating within a Euclidean vector space. However, chemical compounds are typically part of a whole, and the appropriate vector space for their analysis is the simplex. Statistical analyses applied to such data without proper pre-processing yield biased and arbitrary results. Compositional analysis of data has not yet been considered in forensic science. Therefore, we compare classical statistical analysis as applied in forensic research with the newly proposed paradigm of compositional data analysis (CoDa), and demonstrate how CoDa improves the analysis of petrol data in forensic science. Our study shows how principal component analysis (PCA) and classification results are affected by the preprocessing steps performed on the raw data. Our results indicate that a log-ratio analysis provides better separation between subgroups of the data and leads to easier interpretation of the results. In addition, a compositional analysis yields higher classification accuracy. Even a non-linear classification method, in our case a random forest, was shown to perform poorly when applied without compositional methods. Moreover, normalization of samples to remove laboratory or unit-of-measurement effects is no longer necessary: in compositional thinking an observation is equivalent to any positive multiple of itself, because the (log) ratios between its parts are unchanged by such rescaling. Petrol data from different petrol stations in Brazil are used for the demonstration. These data are highly susceptible to counterfeit petrol. Forensic analysis of its chemical elements requires non-biased statistical analysis designed for compositional data to detect fraud. Based on these results, we recommend the use of compositional data methods for gasoline and petrol chemical element analysis and for gasoline product characterization, authentication and fraud detection in forensic sciences.
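
The scale-invariance claim in this abstract can be illustrated with a minimal sketch. It assumes the centred log-ratio (clr) transform as the log-ratio representation; the abstract does not say which log-ratio transform the authors used, and the three-part sample below is hypothetical, not taken from the study.

```python
import math

def clr(parts):
    """Centred log-ratio transform of a strictly positive composition."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log removes any common scale factor.
    return [v - mean_log for v in logs]

# Hypothetical three-part composition (illustrative values only).
sample = [12.0, 3.0, 5.0]
# The same sample reported in different units: every part multiplied by 10.
rescaled = [10.0 * p for p in sample]

clr_sample = clr(sample)
clr_rescaled = clr(rescaled)

# True when both versions map to the same clr coordinates.
same = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(clr_sample, clr_rescaled)
)
print(same)
```

Because the clr coordinates depend only on ratios between parts, rescaling the whole sample leaves them unchanged, which is why a separate normalization step becomes unnecessary.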

2.
J Appl Stat ; 47(7): 1144-1167, 2020.
Article in English | MEDLINE | ID: mdl-35707025

ABSTRACT

Outlier detection can be seen as a pre-processing step for locating data points in a sample that do not conform to the majority of observations. Various techniques and methods for outlier detection can be found in the literature dealing with different types of data. However, many data sets are inflated by true zeros and, in addition, some components/variables might be of compositional nature. Important examples of such data sets are the Structural Earnings Survey, the Structural Business Statistics, the European Statistics on Income and Living Conditions, tax data or, as in this contribution, household expenditure data, which are used, for example, to estimate the Purchase Power Parity of a country. In this work, robust univariate and multivariate outlier detection methods are compared by a complex simulation study that considers various challenges included in data sets, namely structural (true) zeros, missing values, and compositional variables. These circumstances make it difficult or impossible to flag true outliers and influential observations by well-known outlier detection methods. Our aim is to assess the performance of outlier detection methods in terms of their effectiveness in identifying outliers when applied to challenging data sets such as the household expenditure data surveyed all over the world. Moreover, different methods are evaluated through a close-to-reality simulation study. Differences in performance of univariate and multivariate robust techniques for outlier detection and their shortcomings are reported. We found that robust multivariate methods outperform robust univariate methods. The methods that performed best in finding the outliers and in providing a low false discovery rate were the generalized S estimators (GSE), the BACON-EEM algorithm and a compositional method (CoDa-Cov). In addition, these methods also performed best when the outliers are imputed based on the corresponding outlier detection method and indicators are estimated from the data sets.
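
As a rough illustration of what a robust univariate rule looks like when structural zeros are present, the sketch below flags values by their median/MAD robust z-score and leaves zeros out of both the estimation and the flagging. This is a generic textbook rule chosen for illustration; it is not one of the methods compared in the paper (GSE, BACON-EEM, CoDa-Cov), and the function name, threshold and expenditure values are all hypothetical.

```python
import statistics

def mad_outliers(values, threshold=3.0):
    """Flag nonzero values whose median/MAD robust z-score exceeds threshold.

    Zeros are treated as structural: they are excluded from the location and
    scale estimates and are never flagged. (Assumption made for this sketch.)
    """
    nonzero = [v for v in values if v != 0]
    med = statistics.median(nonzero)
    mad = statistics.median([abs(v - med) for v in nonzero])
    scale = 1.4826 * mad  # consistency factor for normally distributed data
    if scale == 0:
        return [False] * len(values)
    return [v != 0 and abs(v - med) / scale > threshold for v in values]

# Hypothetical household expenditures on one item; zeros mean "not purchased".
spend = [0, 120, 135, 0, 128, 131, 900, 125]
flags = mad_outliers(spend)
print(flags)
```

A rule of this kind inspects one variable at a time, so it cannot use relationships between variables; the abstract reports that robust multivariate methods outperformed robust univariate ones.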

3.
Geburtshilfe Frauenheilkd ; 76(10): 1086-1091, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27761030

ABSTRACT

Introduction: Diagnosis and treatment of vaginal and cervical cytological cell changes are described in European and national guidelines. The aim of this data collection was to evaluate the remission rates of PAP III and PAP III D cytological findings in patients over a period of 3-4 months. Method: The current state of affairs in managing suspicious cytological findings (PAP III and PAP III D) in gynecological practice was assessed in the context of a data collection survey. An evaluation over a period of 24 months was conducted on preventative measures, the occurrence of and changes in normal/suspect/pathological findings, and therapy management (for suspicious or pathological findings). Results: 307 female patients were included in the analysis. At the time of the survey 186 patients (60.6 %) had PAP III and 119 (38.8 %) had PAP III D findings. The spontaneous remission rate of untreated PAP III patients was 6 % and that of untreated PAP III D patients was 11 %. The remission rates of patients treated with a vaginal gel were 77 % for PAP III and 71 % for PAP III D. Conclusion: A new treatment option was used in gynecological practice on patients with PAP III and PAP III D findings between confirmation and the next follow-up with excellent success.
