Results 1 - 16 of 16
1.
Forensic Sci Int ; 357: 111994, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38522325

ABSTRACT

Likelihood ratios (LRs) are a useful measure of evidential strength. In forensic casework consisting of a flow of cases with essentially the same question and the same analysis method, it is feasible to construct an 'LR system', that is, an automated procedure that has the observations as input and an LR as output. This paper is aimed at practitioners interested in building their own LR systems. It gives an overview of the different steps needed to get to a validated LR system from data. The paper is accompanied by a notebook that illustrates each step with an example using glass data. The notebook introduces open-source software in Python constructed by NFI (Netherlands Forensic Institute) data scientists and statisticians.

2.
Forensic Sci Int ; 353: 111858, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37863005

ABSTRACT

An automated approach for evaluating the strength of the evidence of firearm toolmark comparison results is presented for a common source scenario. First, comparison scores are derived describing the similarity of marks typically encountered on the primer of fired cartridge cases: aperture shear striations as well as breechface and firing pin impressions. Subsequently, these scores are interpreted using reference distributions of comparison scores obtained for representative known matching (KM) and known non-matching (KNM) ballistic samples in a common source, score-based likelihood ratio (LR) system. We study various alternatives to set up such an LR system and compare them using qualitative and quantitative criteria known from the literature. As an example, results are applied to establish a system suitable for a firearm-ammunition combination often encountered in casework: Glock firearms with Fiocchi nickel primer ammunition. The system outputs an LR and a measure of LR uncertainty. The range of possible LR-values is limited to a minimum and maximum value in areas of the score domain with little reference data. Finally, the feasibility of combining LRs of different mark types into one LR for the entire primer is assessed. For the distribution models considered in this paper, different modeling approaches are optimal for different types of similarity scores. For the chosen firearm-ammunition combination, non-parametric Kernel Density Estimation (KDE) models perform best for similarity scores based on the correlation coefficient, whereas parametric models perform best for the Congruent Matching Cells (CMC) scores, assuming binomial and beta-binomial models for KM and KNM score distributions respectively. Finally, it is demonstrated that individual LRs of different mark types can be combined into one LR, to interpret a set of different marks on the primer as a whole.
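The score-based approach described in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian-kernel KDE with a fixed, user-supplied bandwidth and user-supplied LR limits; the function names, the bandwidth and the limit values are hypothetical and are not taken from the paper, which does not state how its limits or kernel settings were chosen.

```python
import math


def gaussian_kde_density(x, samples, bandwidth):
    """Kernel density estimate at x: average of Gaussian kernels on the samples."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return total / (len(samples) * norm)


def score_based_lr(score, km_scores, knm_scores, bandwidth, lr_min, lr_max):
    """Ratio of KM to KNM score densities at `score`, limited to [lr_min, lr_max]."""
    numerator = gaussian_kde_density(score, km_scores, bandwidth)
    denominator = gaussian_kde_density(score, knm_scores, bandwidth)
    if numerator == 0.0 and denominator == 0.0:
        # Both densities underflow: no information (illustrative choice only).
        return 1.0
    raw = numerator / denominator if denominator > 0.0 else math.inf
    # Limit the LR where reference data are sparse (limits are assumed inputs).
    return min(max(raw, lr_min), lr_max)


# Toy reference scores, invented for illustration only.
km = [0.8, 0.9]
knm = [0.1, 0.2]
print(score_based_lr(0.85, km, knm, bandwidth=0.1, lr_min=1e-3, lr_max=1e3))
```

In practice the bandwidth and the limits would be derived from the reference data rather than fixed by hand.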

3.
Forensic Sci Int Synerg ; 4: 100230, 2022.
Article in English | MEDLINE | ID: mdl-35647509

ABSTRACT

We agree wholeheartedly with Biedermann (2022) FSI Synergy article 100222 in its criticism of research publications that treat forensic inference in source attribution as an "identification" or "individualization" task. We disagree, however, with its criticism of the use of machine learning for forensic inference. The argument it makes is a strawman argument. There is a growing body of literature on the calculation of well-calibrated likelihood ratios using machine-learning methods and relevant data, and on the validation under casework conditions of such machine-learning-based systems.

4.
Forensic Sci Int ; 321: 110722, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33684845

ABSTRACT

Numerical likelihood-ratio (LR) systems aim to calculate evidential strength for forensic evidence evaluation. Calibration of such LR-systems is essential: one does not want to over- or understate the strength of the evidence. Metrics that measure calibration differ in sensitivity to errors in calibration of such systems. In this paper we compare four calibration metrics by a simulation study based on Gaussian Log LR-distributions. Three calibration metrics are taken from the literature (Good, 1985; Royall, 1997; Ramos and Gonzalez-Rodriguez, 2013) [1-3], and a fourth metric is proposed by us. We evaluated these metrics by two performance criteria: differentiation (between well- and ill-calibrated LR-systems) and stability (of the value of the metric for a variety of well-calibrated LR-systems). Two metrics from the literature (the expected values of LR and of 1/LR, and the rate of misleading evidence stronger than 2) do not behave as desired in many simulated conditions. The third one (Cllrcal) performs better, but our newly proposed method (which we coin devPAV) is shown to behave equally well to clearly better under almost all simulated conditions. On the basis of this work, we recommend using both devPAV and Cllrcal to measure calibration of LR-systems, where the current results indicate that devPAV is the preferred metric. In the future, the external validity of this comparison study can be extended by simulating non-Gaussian LR-distributions.
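As background to the Cllr-based metric named in this abstract, the following is a minimal sketch of the log-likelihood-ratio cost (Cllr) as it is commonly defined in the literature. It does not implement Cllrcal or devPAV: the calibration component is usually obtained by comparing Cllr with its value after PAV recalibration, which is omitted here. The function name and the toy inputs are assumptions.

```python
import math


def cllr(same_source_lrs, different_source_lrs):
    """Log-likelihood-ratio cost: lower is better, 1.0 for an uninformative system."""
    # Same-source comparisons are penalised for small LRs.
    penalty_ss = sum(math.log2(1.0 + 1.0 / lr) for lr in same_source_lrs)
    penalty_ss /= len(same_source_lrs)
    # Different-source comparisons are penalised for large LRs.
    penalty_ds = sum(math.log2(1.0 + lr) for lr in different_source_lrs)
    penalty_ds /= len(different_source_lrs)
    return 0.5 * (penalty_ss + penalty_ds)


# A system that always reports LR = 1 carries no information.
print(cllr([1.0, 1.0], [1.0, 1.0]))
```

A system whose LRs point strongly in the wrong direction scores above 1, which is why the cost is sensitive to overstated evidence.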

5.
Sci Rep ; 10(1): 20502, 2020 Nov 25.
Article in English | MEDLINE | ID: mdl-33239698

ABSTRACT

In arson cases, evidence such as DNA or fingerprints is often destroyed. One of the most important evidence modalities left is relating fire accelerants to a suspect. When gasoline is used as accelerant, the aim is to find a strong indication that a gasoline sample from a fire scene is related to a sample of a suspect. Gasoline samples from a fire scene are weathered, which prohibits a straightforward comparison. We combine machine learning, thermodynamic modeling, and quantum mechanics to predict the composition of unweathered gasoline samples starting from weathered ones. Our approach predicts the initial (unweathered) composition of the sixty main components in a weathered gasoline sample, with error bars of ca. 4% when weathered up to 80% w/w. This shows that machine learning is a valuable tool for predicting the initial composition of a weathered gasoline, and thereby relating samples to suspects.

6.
Forensic Sci Int ; 316: 110431, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32980719

ABSTRACT

For evidence evaluation of the physicochemical properties of glass at activity level a well-known formula introduced by Evett & Buckleton [1,2] is commonly used. Parameters in this formula are, amongst others, the probability in a background population to find on somebody's clothing the observed number of glass sources and the probability in a background population to find on somebody's clothing a group of fragments with the same size as the observed matching group. Recently, for efficiency reasons, the Netherlands Forensic Institute changed its methodology to measure not all the glass fragments but a subset of glass fragments found on clothing. Due to the measurement of subsets, it is difficult to get accurate estimates for these parameters in this formula. We offer a solution to this problem. The heart of the solution consists of relaxing the assumption of conditional independence of group sizes of background fragments, and modelling the probability of an allocation of background fragments into groups given a total number of background fragments by a two-parameter Chinese restaurant process (CRP) [3]. Under the assumption of random sampling of fragments to be measured from recovered fragments in the laboratory, parameter values for the Chinese restaurant process may be estimated from a relatively small dataset of glass in other relevant cases. We demonstrate this for a dataset of glass fragments collected from upper garments in casework, show model fit and provide a prototypical calculation of an LR at activity level accompanied with a parameter sensitivity analysis for reasonable ranges of the CRP parameter values. Considering that other laboratories may want to measure subsets as well, we believe this is an important alternative approach to the evaluation of numerical LRs for glass analyses at activity level.
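To make the two-parameter Chinese restaurant process concrete, here is a minimal sketch of the probability of one specific grouping of fragments under the standard sequential seating rule. The parameter names `discount` and `concentration` are assumptions, and the sketch is not the estimation procedure or the activity-level formula from the paper. It returns the probability of one labelled set partition; the probability of an unordered list of group sizes would additionally require counting the set partitions with those sizes.

```python
def crp_partition_probability(group_sizes, discount, concentration):
    """Probability of one specific set partition under a two-parameter CRP."""
    if not 0.0 <= discount < 1.0 or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")
    if any(size < 1 for size in group_sizes):
        raise ValueError("every group must contain at least one item")
    probability = 1.0
    seated = 0  # number of fragments already allocated
    for groups_so_far, size in enumerate(group_sizes):
        if seated > 0:
            # Open a new group when `groups_so_far` groups already exist.
            probability *= (concentration + groups_so_far * discount) / (
                seated + concentration
            )
        seated += 1
        for occupants in range(1, size):
            # Join a group that currently holds `occupants` fragments.
            probability *= (occupants - discount) / (seated + concentration)
            seated += 1
    return probability


# Three fragments: one group of 3, three ways to split 2+1, or all separate.
d, theta = 0.5, 1.0
total = (
    crp_partition_probability([3], d, theta)
    + 3 * crp_partition_probability([2, 1], d, theta)
    + crp_partition_probability([1, 1, 1], d, theta)
)
print(total)
```

Because the process is exchangeable, the order in which groups are listed does not change the result.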

7.
Forensic Sci Int ; 314: 110388, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32663721

ABSTRACT

In their paper "The evaluation of evidence for microspectrophotometry data using functional data analysis", in FSI 305, Aitken et al. present a likelihood-ratio (LR) system for their data. We show the values generated by this system cannot be interpreted as LRs: they are ill-calibrated and should be interpreted as discriminating scores. We demonstrate how to transform the scores to well-calibrated LRs using a post-hoc calibrating step. Also, we address criticisms of calibration posited by Aitken et al. We conclude by noting that ill-calibrated LR-values are misleadingly small or large. Therefore calibration should be measured and, if necessary, corrected for. The corrected LR-values (instead of the discriminating scores) can be used to update the prior odds in Bayes rule.
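The final sentence of this abstract refers to updating prior odds with an LR. A minimal sketch of the odds form of Bayes' rule follows; the function names and the numbers are illustrative assumptions, not values from the paper.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: posterior odds = prior odds x LR."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis to a probability."""
    return odds / (1.0 + odds)


# Prior odds of 1:100 combined with an LR of 100 give even posterior odds.
odds = posterior_odds(0.01, 100.0)
print(odds, odds_to_probability(odds))
```

The update is only meaningful if the LR is well calibrated, which is the point the abstract makes about uncalibrated scores.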

8.
Sci Justice ; 60(1): 20-29, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31924285

ABSTRACT

Activity level evaluations, although still a major challenge for many disciplines, bring a wealth of possibilities for a more formal approach to the evaluation of interdisciplinary forensic evidence. This paper proposes a practical methodology for combining evidence from different disciplines within the likelihood ratio framework. Evidence schemes introduced in this paper make the process of combining evidence more insightful and intuitive, thereby assisting experts in their interdisciplinary evaluation and in explaining this process to the courts. When confronted with two opposing scenarios and multiple types of evidence, the likelihood ratio approach allows experts to combine this evidence in a probabilistic manner. Parts of the prosecution and defence scenarios for which forensic science is expected to be informative are identified. For these so-called core elements, activity level propositions are formulated. Afterwards, evidence schemes are introduced to assist the expert in combining the evidence in a logical manner. Two types of evidence relations are identified: serial and parallel evidence. Practical guidelines are given on how to deal with both types of evidence relations when combining the evidence.


Subject(s)
Forensic Sciences; Models, Statistical; Expert Testimony/methods; Humans

9.
Sci Justice ; 57(3): 181-192, 2017 May.
Article in English | MEDLINE | ID: mdl-28454627

ABSTRACT

For the comparative analysis of glass fragments, a method using Laser Ablation Inductively Coupled Plasma Mass Spectrometry (LA-ICP-MS) is in use at the NFI, giving measurements of the concentration of 18 elements. An important question is how to evaluate the results as evidence that a glass sample originates from a known glass source or from an arbitrary different glass source. One approach is the use of matching criteria, e.g. based on a t-test or overlap of confidence intervals. An important drawback of this method is the fact that the rarity of the glass composition is not taken into account. A similar match can have widely different evidential values. In addition, the use of fixed matching criteria can give rise to a "fall off the cliff" effect. Small differences may result in a match or a non-match. In this work a likelihood ratio system is presented, largely based on the two-level model as proposed by Aitken and Lucy [1], and Aitken, Zadora and Lucy [2]. Results show that the output from the two-level model gives good discrimination between same and different source hypotheses, but a post-hoc calibration step is necessary to improve the accuracy of the likelihood ratios. Subsequently, the robustness and performance of the LR system are studied. Results indicate that the output of the LR system is robust to the sample properties of the dataset used for calibration. Furthermore, the empirical upper and lower bound method [3], designed to deal with extrapolation errors in the density models, results in minimum and maximum values of the LR outputted by the system of 3.1×10⁻³ and 3.4×10⁴. Calibration of the system, as measured by empirical cross-entropy, shows good behavior over the complete prior range. Rates of misleading evidence are small: for same-source comparisons, 0.3% of LRs support a different-source hypothesis; for different-source comparisons, 0.2% support a same-source hypothesis. The authors use the LR system in reporting of glass cases to support expert opinion in the interpretation of glass evidence for origin of source questions.
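The rates of misleading evidence quoted in this abstract can be computed from validation LRs as in the sketch below. The function name and the toy LR values are assumptions for illustration and are unrelated to the glass data in the paper.

```python
def misleading_evidence_rates(same_source_lrs, different_source_lrs):
    """Return (fraction of same-source LRs < 1, fraction of different-source LRs > 1)."""
    misleading_ss = sum(1 for lr in same_source_lrs if lr < 1.0)
    misleading_ds = sum(1 for lr in different_source_lrs if lr > 1.0)
    return (
        misleading_ss / len(same_source_lrs),
        misleading_ds / len(different_source_lrs),
    )


# Invented validation LRs: one misleading value in each set.
rates = misleading_evidence_rates([10.0, 0.5, 3.0, 20.0], [0.1, 2.0, 0.01, 0.3, 0.2])
print(rates)
```

An LR of exactly 1 supports neither hypothesis and is counted as not misleading in this sketch.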

11.
J Forensic Sci ; 62(3): 626-640, 2017 May.
Article in English | MEDLINE | ID: mdl-28168685

ABSTRACT

In this article, the performance of a score-based likelihood ratio (LR) system for comparisons of fingerprints with fingermarks is studied. The system is based on an automated fingerprint identification system (AFIS) comparison algorithm and focuses on fingerprint comparisons where the fingermarks contain 6-11 minutiae. The hypotheses under consideration are evaluated at the level of the person, not the finger. The LRs are presented with bootstrap intervals indicating the sampling uncertainty involved. Several aspects of the performance are measured: leave-one-out cross-validation is applied, and rates of misleading evidence are studied in two ways. A simulation study is performed to study the coverage of the bootstrap intervals. The results indicate that the evidential strength for same source comparisons that do not meet the Dutch twelve-point standard may be substantial. The methods used can be generalized to measure the performance of score-based LR systems in other fields of forensic science.

12.
J Forensic Sci ; 62(4): 1007-1014, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28032347

ABSTRACT

In this paper, a method is described to quantify estimations of the total amount of drugs in groups of seized items, including quantification of the precision. Previous work on this topic was based on the assumptions of normally distributed measurements and grouping of items with a common relative standard deviation. In practice, these assumptions are often violated, for example, for data with point masses at 0, or if certain items in a group have a very high standard deviation. The method described in this paper is based on work by Welch and Satterthwaite and does not assume constant relative standard deviations. Case examples are described for which the method is applied, and simulation studies are carried out for which both methods are applied. In the cases, both methods perform reasonably well. If the assumption of common relative standard deviations clearly does not apply, it is advised to use the method described.
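The abstract says only that the method builds on work by Welch and Satterthwaite. As background, the sketch below gives the textbook Welch-Satterthwaite effective degrees of freedom for a sum of independent group means. It is not the procedure from the paper, and the function name and inputs are assumptions.

```python
def welch_satterthwaite_df(variances, sample_sizes):
    """Effective degrees of freedom for a sum of independent group means.

    variances: sample variance of the measurements in each group
    sample_sizes: number of measurements in each group (each at least 2)
    """
    if any(n < 2 for n in sample_sizes):
        raise ValueError("each group needs at least two measurements")
    # Variance of each group mean.
    terms = [v / n for v, n in zip(variances, sample_sizes)]
    numerator = sum(terms) ** 2
    denominator = sum(t ** 2 / (n - 1) for t, n in zip(terms, sample_sizes))
    if denominator == 0.0:
        raise ValueError("at least one group must have non-zero variance")
    return numerator / denominator


# Two groups with equal variance and size give 2 * (n - 1) degrees of freedom.
print(welch_satterthwaite_df([1.0, 1.0], [10, 10]))
```

No common relative standard deviation is assumed: each group contributes its own variance, so a single group with a very large variance pulls the effective degrees of freedom towards its own n - 1.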


Subject(s)
Drug Trafficking; Illicit Drugs; Models, Statistical; Weights and Measures; Clothing; Forensic Sciences/methods; Humans

13.
Sci Justice ; 56(6): 482-491, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27914556

ABSTRACT

A recent trend in forensic science is the development of objective, automated systems for the comparison of trace and reference material that give as output numerical likelihood ratios (LRs). For well discriminating LR systems, often the probability of the evidence given one or the other hypothesis depends on the density from the tail of a probability distribution. The models for probability distributions are trained by data. Since there is no proof of the applicability of the models beyond the data range, LR systems are sensitive to extrapolation errors. Given the unknown behavior in the tail region one may define the problem as when to stop extrapolating. When applied to LR systems, this leads to limit values of the likelihood ratio (e.g. a minimum and a maximum value of the LR outputted by the LR system), depending on the sizes of the validation datasets used. The solution proposed in this paper to determine these limits is based on the normalized Bayes error-rate [1] in combination with the introduction of misleading LRs with increasing strength.

14.
Forensic Sci Int ; 250: 57-67, 2015 May.
Article in English | MEDLINE | ID: mdl-25828379

ABSTRACT

This paper aims to provide the first steps towards a numerical source level evaluation of fibre evidence. For that purpose, likelihood ratio equations are derived for four generic scenarios, in which the source frequency, the number of references and trace types investigated, and the number of matches vary. Previous experimental studies into the evaluation of fibre evidence are reviewed and we demonstrate how the results of these studies, as well as other data, can be used to evaluate the derived equations for the four scenarios. Evaluation is not straightforward and requires a number of assumptions. This is mainly because the relevant population under consideration in a specific case cannot be sufficiently evaluated. In addition, the subjective match-criterion in current forensic fibre examinations makes it impossible to implement a good evaluation of the within-variation of samples. As a result, the discrimination power, currently calculated for discrimination studies, is only valid for samples with negligible heterogeneity. We conclude that reporting a numerical evidential value for forensic fibre examinations is not yet feasible as the data are available for only a few types of fibres and cannot be used without several assumptions. We propose a number of developments that are required to improve the accuracy and numerical analysis.

15.
Forensic Sci Int Genet ; 14: 156-60, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25450786

ABSTRACT

Matching DNA profiles of an accused person and a crime scene trace are one of the most common forms of forensic evidence. A number of years ago the so-called 'DNA controversy' was concerned with how to quantify the value of such evidence. Given its importance, the lack of understanding of such a basic issue was quite surprising and concerning. Deriving the equation for the likelihood ratio of a DNA database match in a much more direct and simple way is the topic of this paper. As it is much easier to follow it is hoped that this derivation will contribute to the understanding.


Subject(s)
Databases, Genetic; Forensic Genetics; Humans; Likelihood Functions

16.
Psychiatry Res ; 200(2-3): 904-10, 2012 Dec 30.
Article in English | MEDLINE | ID: mdl-22884307

ABSTRACT

Self-report measures of psychological distress or psychopathology are widely used and can be easily implemented as psychiatric screening tools. Positive psychological constructs such as vitality/optimism and work functioning have scarcely been incorporated. We aimed to develop and validate a psychological distress instrument, including measures of vitality and work functioning. A patient sample with suspected depressive, anxiety, and somatoform disorders (N=242) and a reference sample of the general population (N=516) filled in the 48-item Symptom Questionnaire (SQ-48) plus a battery of observer-rated and self-report scales (MINI Plus, MADR, BAS, INH, BSI), using a web-based ROM programme. The resulting SQ-48 is multidimensional and includes the following nine subscales: Depression (MOOD, six items), Anxiety (ANXI, six items), Somatization (SOMA, seven items), Agoraphobia (AGOR, four items), Aggression (AGGR, four items), Cognitive problems (COGN, five items), Social Phobia (SOPH, five items), Work functioning (WORK, five items), and Vitality (VITA, six items). The results showed good internal consistency as well as good convergent and divergent validity. The SQ-48 is meant to be available in the public domain for Routine Outcome Monitoring (ROM) and can be used as a screening/monitoring tool in clinical settings (psychiatric and non-psychiatric), as a benchmark tool, or for research purposes.


Subject(s)
Anxiety Disorders/diagnosis; Depressive Disorder/diagnosis; Somatoform Disorders/diagnosis; Adult; Anxiety/diagnosis; Depression/diagnosis; Female; Humans; Male; Middle Aged; Psychometrics; Self Report; Surveys and Questionnaires