Results 1 - 20 of 26
1.
Article in English | MEDLINE | ID: mdl-38896524

ABSTRACT

The weight of DNA evidence for forensic applications is typically assessed through the calculation of the likelihood ratio (LR). In the standard workflow, DNA is extracted from a collection of cells in which the cells of an unknown number of donors are mixed. The DNA is then genotyped, and the LR is calculated through well-established methods. Recently, a method for calculating the LR from single-cell data has been presented. Rather than extracting the DNA while the cells are still mixed, single-cell data are procured by first isolating each cell. Extraction and fragment analysis of relevant forensic loci follow such that individual cells are genotyped. This workflow leads to significantly stronger weights of evidence, but it does not account for extracellular DNA that may also be present in the sample. In this paper, we present a method for calculating an LR that combines single-cell and extracellular data. We demonstrate the calculation on example data and show that the combined LR can support stronger conclusions than would be obtained by calculating LRs on the single-cell and extracellular DNA separately.
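
As a point of reference, the likelihood ratio in this setting has the standard form below; the joint formulation for single-cell plus extracellular data is only a sketch of the idea, with H_p and H_d denoting the prosecution and defense hypotheses, S the single-cell data and X the extracellular data (the notation is ours, not the paper's):

$$
\mathrm{LR} = \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)},
\qquad
\mathrm{LR}_{\text{combined}} = \frac{\Pr(S, X \mid H_p)}{\Pr(S, X \mid H_d)} .
$$

If S and X were treated as independent given each hypothesis, the combined LR would reduce to the product of the two separate LRs; evaluating them jointly is one intuition for why a combined calculation can be stronger than reporting the two LRs separately.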

2.
Forensic Sci Int Genet ; 69: 103000, 2024 03.
Article in English | MEDLINE | ID: mdl-38199167

ABSTRACT

In the absence of a suspect the forensic aim is investigative, and the focus is one of discerning which genotypes best explain the evidence. In traditional systems, the list of candidate genotypes may become vast if the sample contains DNA from many donors or the information from a minor contributor is swamped by that of major contributors, leading to lower evidential value for a true donor's contribution and, as a result, possibly overlooked or inefficient investigative leads. Recent developments in single-cell analysis offer a way forward, by producing data capable of discriminating genotypes. This is accomplished by first clustering single-cell electropherograms (scEPGs) by similarity without reference to a known genotype. With good clustering it is reasonable to assume that the scEPGs in a cluster are of a single contributor. With that assumption we determine the probability of a cluster's content given each possible genotype at each locus, which is then used to determine the posterior probability mass distribution over all genotypes by application of Bayes' rule. A decision criterion is then applied such that the sum of the ranked probabilities of all genotypes falling in the set is at least 1-α. This is the credible genotype set and is used to inform database search criteria. In this work we demonstrate the salience of single-cell analysis by performance testing a set of 630 previously constructed admixtures containing up to 5 donors of balanced and unbalanced contributions. We use scEPGs that were generated by isolating single cells, employing a direct-to-PCR extraction treatment, amplifying STRs that are compliant with existing national databases and applying post-PCR treatments that elicit a detection limit of one DNA copy. We determined that, for these test data, 99.3% of the true genotypes are included in the 99.8% credible set, regardless of the number of donors that comprised the mixture. We also determined that the most probable genotype was the true genotype for 97% of the loci when the number of cells in a cluster was at least two. Since efficient investigative leads will be borne by posterior mass distributions that are narrow and concentrated at the true genotype, we report that, for this test set, 47,900 (86%) loci returned only one credible genotype and of these 47,551 (99%) were the true genotype. When determining the LR for true contributors, 91% of the clusters rendered LR > 10^18, showing the potential of single-cell data to positively affect investigative reporting.
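
The credible-set construction described above (rank the genotype posteriors and keep the smallest set whose probabilities sum to at least 1-α) is simple to state in code. The sketch below is a minimal illustration of that description only; the function name and the toy posterior are invented and it is not the authors' implementation.

```python
def credible_genotype_set(posterior, alpha=0.002):
    """Smallest set of genotypes whose ranked posterior probabilities
    sum to at least 1 - alpha (a 99.8% credible set when alpha = 0.002).
    `posterior` maps genotype -> posterior probability at one locus."""
    ranked = sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)
    credible, total = [], 0.0
    for genotype, prob in ranked:
        credible.append(genotype)
        total += prob
        if total >= 1.0 - alpha:
            break
    return credible

# Toy posterior at a single locus (illustrative numbers only).
posterior = {"16,17": 0.999, "16,16": 0.0007, "17,17": 0.0002, "15,17": 0.0001}
print(credible_genotype_set(posterior))  # ['16,17'] -- a single credible genotype
```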


Subject(s)
DNA Fingerprinting , Microsatellite Repeats , Humans , DNA Fingerprinting/methods , Bayes Theorem , Genotype , DNA/genetics , Likelihood Functions
3.
Forensic Sci Int Genet ; 64: 102852, 2023 05.
Article in English | MEDLINE | ID: mdl-36934551

ABSTRACT

The consistency between DNA evidence and person(s) of interest (PoI) is summarized by a likelihood ratio (LR): the probability of the data given the PoI contributed divided by the probability given they did not. It is often the case that there are several PoI who may have individually or jointly contributed to the stain. If there is more than one PoI, or the number of contributors (NoC) cannot easily be determined, then several sets of hypotheses are needed, requiring significant resources to complete the interpretation. Recent technological developments in laboratory systems offer a way forward, by enabling production of single-cell data. Though single-cell data may be procured by next generation sequencing or capillary electrophoresis workflows, in this work we focus our attention on assessing the consistency between PoIs and a collection of single-cell electropherograms (scEPGs) from diploid cells (i.e., leukocytes and epithelial cells). Specifically, we introduce a framework that: I) clusters scEPGs into collections, each originating from one genetic source; II) for each PoI, determines an LR for each cluster of scEPGs; and III) averages the likelihood ratios for each PoI across all clusters to provide a whole-sample weight-of-evidence summary. By using Model Based Clustering (MBC) in step I) and an algorithm, named EESCIt for Evidentiary Evaluation of Single Cells, that computes single-cell LRs in step II), we show that 99% of the comparisons rendered log LR values > 0 for true contributors, and of these all but one gave log LR > 5, regardless of the number of donors or whether the smallest contributor donated less than 20% of the cells, greatly expanding the collection of cases for which DNA forensics provides informative results.
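
Step III is an averaging of per-cluster LRs. The abstract does not spell out the averaging rule, so the sketch below simply takes an arithmetic mean over hypothetical per-cluster values; it stands in for, rather than reproduces, the EESCIt output.

```python
from statistics import mean

def whole_sample_lr(cluster_lrs):
    """Combine one PoI's per-cluster LRs into a whole-sample summary.
    A plain arithmetic mean is used here purely for illustration."""
    return mean(cluster_lrs)

# Hypothetical per-cluster LRs for one PoI across three scEPG clusters:
# strong support from one cluster, near-neutral values from the others.
print(whole_sample_lr([1.0e18, 2.5, 0.8]))
```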


Subject(s)
DNA Fingerprinting , Microsatellite Repeats , Humans , Likelihood Functions , DNA Fingerprinting/methods , Algorithms , DNA/genetics
5.
J Forensic Sci ; 67(2): 697-706, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34936089

ABSTRACT

Interpreting forensic DNA signal is arduous since the total intensity is a cacophony of signal from noise, artifact, and allele from an unknown number of contributors (NOC). An alternative to traditional bulk-processing pipelines is a single-cell one, where the sample is collected and each cell is sequestered, resulting in n single-source, single-cell EPGs (scEPGs) that must be interpreted using applicable strategies. As with all forensic DNA interpretation strategies, high quality electropherograms are required; thus, to enhance the credibility of single-cell forensics, it is necessary to produce an efficient direct-to-PCR treatment that is compatible with prevailing downstream laboratory processes. We incorporated the semi-automated microfluidic DEPArray™ technology into the single-cell laboratory and optimized its implementation by testing the effects of four laboratory treatments on single-cell profiles. We focused on testing the effects of phosphate-buffered saline (PBS) since it is an important reagent that mitigates cell rupture but is also a PCR inhibitor. Specifically, we explored the effect of decreasing PBS concentrations on five electropherogram-quality metrics from 241 leukocytes: profile drop-out, allele drop-out, allele peak heights, peak height ratios, and scEPG sloping. In an effort to improve reagent use, we also assessed two concentrations of proteinase K. The results indicate that decreasing the PBS concentration to 0.5X or 0.25X improves scEPG quality, while modest modifications to the proteinase K concentration did not significantly impact it. We therefore conclude that a lower than recommended proteinase K concentration coupled with a lower than recommended PBS concentration results in enhanced scEPGs within the semi-automated single-cell pipeline.


Subject(s)
DNA Fingerprinting , DNA , Endopeptidase K , Alleles , DNA/analysis , DNA Fingerprinting/methods , Endopeptidase K/genetics , Forensic Genetics , Microsatellite Repeats , Polymerase Chain Reaction/methods
6.
Forensic Sci Int Genet ; 54: 102556, 2021 09.
Article in English | MEDLINE | ID: mdl-34225042

ABSTRACT

Complex DNA mixtures are challenging to interpret and require computational tools that aid in that interpretation. Recently, several computational methods that estimate the number of contributors (NOC) to a sample have been developed. Unlike analogous tools that interpret profiles and report LRs, NOC tools vary widely in their operational principle, where some are Bayesian and others are machine learning tools. Additionally, NOC tools may return a single estimate of n, or a distribution on n. This vast array of constructs, coupled with a gap in standardized methods by which to validate NOC systems, warrants an exploration into the measures by which differing NOC systems might be tested for operational use. In the current paper, we use two exemplar NOC systems: a probabilistic system named NOCIt, which renders an a posteriori probability (APP) distribution on the number of contributors given an electropherogram, and an artificial neural network (ANN). NOCIt is a continuous Bayesian inference system incorporating models of peak height, degradation, differential degradation, forward and reverse stutter, noise and allelic drop-out while considering allele frequencies in a reference population. The ANN is also a continuous method, taking all the same features (barring degradation) into account. Unlike its Bayesian counterpart, it demands substantively more data to parameterize, requiring synthetic data. We explored each system's performance by conducting tests on 214 PROVEDIt mixtures where the limit of detection was one copy of DNA. We found that after a lengthy training period of approximately 24 h, the ANN's evaluation process was very fast and perfectly repeatable. In contrast, NOCIt took only a few minutes to train but tens of minutes to complete each sample and was less repeatable. It did, however, render a probability distribution that was more sensitive and specific, affording a reasonable method by which to report all plausible n that explain the evidence for a given sample. Whatever the method, by acknowledging the inherent differences between NOC systems, we demonstrate that validation constructs will necessarily be guided by the needs of the forensic domain and depend upon whether the laboratory seeks to assign a single n or a range of n.
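
The closing point, assigning a single n versus a range of n, can be made concrete with a small sketch: given an APP distribution over contributor numbers, report either its mode or the smallest set of n capturing most of the probability. The function and the example distribution are illustrative, not output from NOCIt or the ANN.

```python
def report_noc(app, alpha=0.05):
    """From an APP distribution on the number of contributors, return both a
    point estimate (the mode) and the smallest set of n whose probabilities
    sum to at least 1 - alpha. Purely illustrative."""
    point = max(app, key=app.get)
    ranked = sorted(app.items(), key=lambda kv: kv[1], reverse=True)
    keep, total = [], 0.0
    for n, p in ranked:
        keep.append(n)
        total += p
        if total >= 1.0 - alpha:
            break
    return point, sorted(keep)

# Hypothetical APP for a low-template mixture.
print(report_noc({1: 0.01, 2: 0.62, 3: 0.33, 4: 0.04}))  # (2, [2, 3])
```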


Subject(s)
DNA Fingerprinting , Microsatellite Repeats , Bayes Theorem , DNA/genetics , Humans , Neural Networks, Computer
7.
Forensic Sci Int Genet ; 54: 102563, 2021 09.
Article in English | MEDLINE | ID: mdl-34284325

ABSTRACT

Forensic DNA signal is notoriously challenging to assess, requiring computational tools to support its interpretation. Over-expression of stutter, allele drop-out, allele drop-in, degradation, differential degradation, and the like make forensic DNA profiles too complicated to evaluate by manual methods. In response, computational tools that make point estimates of the Number of Contributors (NOC) to a sample have been developed, as have Bayesian methods that evaluate an A Posteriori Probability (APP) distribution on the NOC. In cases where an overly narrow NOC range is assumed, the downstream strength of evidence may be incomplete insofar as the evidence is evaluated with an inadequate set of propositions. In the current paper, we extend previous work on NOCIt, a Bayesian method that determines an APP on the NOC given an electropherogram, by reporting on an implementation in which the user can add assumed contributors. NOCIt is a continuous system that incorporates models of peak height (including degradation and differential degradation), forward and reverse stutter, noise, and allelic drop-out, while being cognizant of allele frequencies in a reference population. When conditioned on a known contributor, we found that the mode of the APP distribution can shift upward by one compared with the circumstance where no known contributor is assumed, and that this occurred most often when the assumed contributor was the minor constituent of the mixture. Building on a result of Slooten and Caliebe (FSI:G, 2018) that, under suitable assumptions, establishes that the NOC can be treated as a nuisance variable in the computation of a likelihood ratio between the prosecution and defense hypotheses, we show that this computation must use not only coincident models but also coincident contextual information. The results reported here, therefore, illustrate the power of modern probabilistic systems to assess full weights-of-evidence and to provide information on reasonable NOC ranges across multiple contexts.
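
The Slooten and Caliebe result alluded to here is usually written as a marginalization over the NOC, treated as a nuisance variable n. The form below is the generic version under their assumptions, with the abstract's added requirement that both sums use the same models and the same conditioning information (e.g., the same assumed contributors):

$$
\mathrm{LR}
= \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)}
= \frac{\sum_{n} \Pr(E \mid H_p, n)\,\Pr(n \mid H_p)}
       {\sum_{n} \Pr(E \mid H_d, n)\,\Pr(n \mid H_d)} .
$$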


Subject(s)
DNA Fingerprinting , Alleles , Bayes Theorem , DNA , Humans
8.
Int J Legal Med ; 135(3): 727-738, 2021 May.
Article in English | MEDLINE | ID: mdl-33484330

ABSTRACT

Current analysis of forensic DNA stains relies on the probabilistic interpretation of bulk-processed samples that represent mixed profiles consisting of an unknown number of potentially partial representations of each contributor. Single-cell methods, in contrast, offer a solution to the forensic DNA mixture problem by incorporating a step that separates cells before extraction. A forensically relevant single-cell pipeline relies on efficient direct-to-PCR extractions that are compatible with standard downstream forensic reagents. Here we demonstrate the feasibility of implementing single-cell pipelines into the forensic process by exploring four metrics of electropherogram (EPG) signal quality (i.e., allele detection rates, peak heights, peak height ratios, and peak height balance across low- to high-molecular-weight short tandem repeat (STR) markers) obtained with four direct-to-PCR extraction treatments and a common post-PCR laboratory procedure. Each treatment was used to extract DNA from 102 single buccal cells, whereupon the amplification reagents were immediately added to the tube and the DNA was amplified/injected using post-PCR conditions known to elicit a limit of detection (LoD) of one DNA molecule. The results show that most cells, regardless of extraction treatment, rendered EPGs with at least a 50% true positive allele detection rate and that allele drop-out was not cell independent. Statistical tests demonstrated that the extraction treatment significantly impacted all metrics of EPG quality, with the Arcturus® PicoPure™ extraction method resulting in the lowest median allele drop-out rate, highest median average peak height, highest median average peak height ratio, and least negative median values of EPG sloping for GlobalFiler™ STR loci amplified at half volume. We therefore conclude that implementing single-cell pipelines for casework purposes is feasible and demonstrate that inferential systems assuming cell independence will not be appropriate for the probabilistic interpretation of a collection of single-cell EPGs.


Subject(s)
Alleles , DNA Fingerprinting/methods , DNA/analysis , DNA/isolation & purification , Polymerase Chain Reaction/methods , Single-Cell Analysis , Electrophoresis, Capillary , Humans , Limit of Detection , Microsatellite Repeats , Mouth Mucosa
9.
Forensic Sci Int Genet ; 47: 102296, 2020 07.
Article in English | MEDLINE | ID: mdl-32339916

ABSTRACT

Forensic DNA signal is notoriously challenging to interpret and requires the implementation of computational tools that support its interpretation. While data from high-copy, low-contributor samples result in electropherogram signal that is readily interpreted by probabilistic methods, electropherogram signal from forensic stains is often garnered from low-copy, high-contributor-number samples and is frequently obfuscated by allele sharing, allele drop-out, stutter and noise. Since forensic DNA profiles are too complicated to quantitatively assess by manual methods, continuous, probabilistic frameworks that draw inferences on the Number of Contributors (NOC) and compute the Likelihood Ratio (LR) given the prosecution's and defense's hypotheses have been developed. In the current paper, we validate a new version of the NOCIt inference platform that determines an A Posteriori Probability (APP) distribution of the number of contributors given an electropherogram. NOCIt is a continuous inference system that incorporates models of peak height (including degradation and differential degradation), forward and reverse stutter, noise and allelic drop-out while taking into account allele frequencies in a reference population. We established the algorithm's performance by conducting tests on samples that were representative of types often encountered in practice. In total, we tested NOCIt's performance on 815 degraded, UV-damaged, inhibited, differentially degraded, or uncompromised DNA mixture samples containing up to 5 contributors. We found that the model made accurate, repeatable and reliable inferences about the NOC and significantly outperformed methods that rely on signal filtering. By leveraging recent theoretical results of Slooten and Caliebe (FSI:G, 2018) that, under suitable assumptions, establish that the NOC can be treated as a nuisance variable, we demonstrated that when NOCIt's APP is used in conjunction with a downstream likelihood ratio (LR) inference system that employs the same probabilistic model, a full evaluation across multiple contributor numbers is rendered. This work, therefore, illustrates the power of modern probabilistic systems to report holistic and interpretable weights-of-evidence to the trier-of-fact without assigning a specified number of contributors or filtering signal.
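
A minimal restatement of that NOC marginalization in code, with per-n likelihoods and the distribution on n supplied as hypothetical inputs; how NOCIt's APP is wired into the downstream LR system in practice is not detailed in this abstract.

```python
def marginal_likelihood(p_e_given_n, p_n):
    """P(E | H) with the number of contributors n as a nuisance variable:
    sum over n of P(E | H, n) * P(n | H)."""
    return sum(p_e_given_n[n] * p_n[n] for n in p_n)

def likelihood_ratio(p_e_hp_by_n, p_e_hd_by_n, p_n):
    # Same probabilistic model and same distribution on n in the numerator
    # and the denominator, as the text requires.
    return marginal_likelihood(p_e_hp_by_n, p_n) / marginal_likelihood(p_e_hd_by_n, p_n)

# Arbitrary illustrative values for a two- vs three-contributor ambiguity.
p_n = {2: 0.7, 3: 0.3}
p_e_hp = {2: 1.0e-20, 3: 4.0e-21}
p_e_hd = {2: 2.0e-28, 3: 1.0e-27}
print(likelihood_ratio(p_e_hp, p_e_hd, p_n))
```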


Subject(s)
DNA Fingerprinting , DNA/genetics , Likelihood Functions , Forensic Genetics/methods , Humans , Models, Statistical
10.
BMC Bioinformatics ; 20(Suppl 16): 584, 2019 Dec 02.
Article in English | MEDLINE | ID: mdl-31787097

ABSTRACT

BACKGROUND: In order to isolate an individual's genotype from a sample of biological material, most laboratories use PCR and Capillary Electrophoresis (CE) to construct a genetic profile based on polymorphic loci known as Short Tandem Repeats (STRs). The resulting profile consists of CE signal which contains information about the length and number of STR units amplified. For samples collected from the environment, interpretation of the signal can be challenging given that information regarding the quality and quantity of the DNA is often limited. The signal can be further compounded by the presence of noise and PCR artifacts such as stutter, which can mask or mimic biological alleles. Because manual interpretation methods cannot comprehensively account for such nuances, it would be valuable to develop a signal model that can effectively characterize the various components of STR signal independent of a priori knowledge of the quantity or quality of the DNA. RESULTS: First, we seek to mathematically characterize the quality of the profile by measuring changes in the signal with respect to amplicon size. Next, we examine the noise, allele, and stutter components of the signal and develop distinct models for each. Using cross-validation and model selection, we identify a model that can be effectively utilized for downstream interpretation. Finally, we show an implementation of the model in NOCIt, a software system that calculates the a posteriori probability distribution on the number of contributors. CONCLUSION: The model was selected using a large, diverse set of DNA samples obtained from 144 different laboratory conditions, with DNA amounts ranging from a single copy of DNA to hundreds of copies and profile quality ranging from pristine to highly degraded. Implemented in NOCIt, the model enables a probabilistic approach to estimating the number of contributors to complex, environmental samples.


Subject(s)
Electrophoresis, Capillary/methods , Microsatellite Repeats/genetics , Models, Statistical , Alleles , DNA/genetics , Humans , Probability , Software
11.
PLoS One ; 13(11): e0207599, 2018.
Article in English | MEDLINE | ID: mdl-30458020

ABSTRACT

Continuous mixture interpretation methods that employ probabilistic genotyping to compute the Likelihood Ratio (LR) utilize more information than threshold-based systems. The continuous interpretation schemes described in the literature, however, do not all use the same underlying probabilistic model, and standards outlining which probabilistic models may or may not be implemented into casework do not exist; thus, it is the individual forensic laboratory or expert that decides which model and corresponding software program to implement. For countries, such as the United States, with an adversarial legal system, one can envision a scenario where two probabilistic models are used to present the weight of evidence, and two LRs are presented by two experts. Conversely, if no independent review of the evidence is requested, one expert using one model may present one LR, as there is no standard or guideline requiring that the uncertainty in the LR estimate be presented. The choice of model determines the underlying probability calculation, and changes to it can result in non-negligible differences in the reported LR or the corresponding verbal categorization presented to the trier-of-fact. In this paper, we study the impact of model differences on the LR and on the corresponding verbal expression computed using four variants of a continuous mixture interpretation method. The four models were tested five times each on 101 one-, two-, and three-person experimental samples with known contributors. For each sample, LRs were computed using the known contributor as the person of interest. In all four models, intra-model variability increased with an increase in the number of contributors and with a decrease in the contributor's template mass. Inter-model variability in the associated verbal expression of the LR was observed in 32 of the 195 LRs used for comparison. Moreover, in 11 of these profiles there was a change from LR > 1 to LR < 1. These results indicate that modifications to existing continuous models do have the potential to significantly impact the final statistic, justifying the continuation of broad-based, large-scale, independent studies to quantify the limits of reliability and variability of existing forensically relevant systems.


Subject(s)
DNA Fingerprinting/methods , Forensic Genetics/methods , Algorithms , Humans , Likelihood Functions , Models, Statistical , Software , United States
12.
Leg Med (Tokyo) ; 32: 1-8, 2018 May.
Article in English | MEDLINE | ID: mdl-29453054

ABSTRACT

The interpretation of DNA evidence may rely upon the assumption that the forensic short tandem repeat (STR) profile is composed of multiple genotypes, or partial genotypes, originating from n contributors. In cases where the number of contributors (NOC) is in dispute, it may be justifiable to compute likelihood ratios that utilize different NOC parameters in the numerator and denominator, or to present different likelihoods separately. Therefore, in this work, we evaluate the impact of allele dropout on estimating the NOC for simulated mixtures with up to six contributors in the presence or absence of a major contributor. These simulations demonstrate that in the presence of dropout, or with the application of an analytical threshold (AT), estimating the NOC using counting methods was unreliable for mixtures containing one or more minor contributors present at low levels. The number of misidentifications was only slightly reduced when we expanded the number of STR loci from 16 to 21. In many of the simulations tested herein, the minimum and actual NOC differed by more than two, suggesting that low-template, high-order mixtures with allele counts fewer than six may originate from as many as four, five, or six persons. Thus, there is justification for the use of differing or multiple assumptions on the NOC when computing the weight of DNA evidence for low-template mixtures, particularly when the peak heights are in the vicinity of the signal threshold or allele counting methods are the mechanism by which the NOC is assessed.
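
The allele-counting logic under discussion is: each diploid contributor can show at most two alleles per locus, so the largest per-locus allele count, divided by two and rounded up, is the minimum NOC. The sketch below shows that calculation and why, with drop-out, it can badly understate the true number; the example counts are invented.

```python
import math

def min_noc_from_allele_counts(allele_counts_by_locus):
    """Minimum number of contributors implied by allele counting: the largest
    per-locus allele count divided by two (at most two alleles per diploid
    contributor per locus), rounded up. With drop-out this is only a lower
    bound and can understate the true NOC considerably."""
    return math.ceil(max(allele_counts_by_locus) / 2)

# A true four-person low-template mixture might, after drop-out, show at most
# five alleles at any locus and so be counted as a three-person mixture.
print(min_noc_from_allele_counts([3, 4, 5, 2, 4]))  # 3
```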


Subject(s)
Complex Mixtures/genetics , DNA Fingerprinting/methods , DNA/genetics , Forensic Genetics/methods , Algorithms , Alleles , Genotype , Humans , Likelihood Functions , Microsatellite Repeats , Specimen Handling
13.
Forensic Sci Int Genet ; 32: 62-70, 2018 01.
Article in English | MEDLINE | ID: mdl-29091906

ABSTRACT

DNA-based human identity testing is conducted by comparison of PCR-amplified polymorphic Short Tandem Repeat (STR) motifs from a known source with the STR profiles obtained from uncertain sources. Samples such as those found at crime scenes often result in signal that is a composite of incomplete STR profiles from an unknown number of unknown contributors, making interpretation an arduous task. To facilitate advancement in STR interpretation challenges we provide over 25,000 multiplex STR profiles produced from one to five known individuals at target levels ranging from one to 160 copies of DNA. The data, generated under 144 laboratory conditions, are classified by total copy number and contributor proportions. For the 70% of samples that were synthetically compromised, we report the level of DNA damage using quantitative and end-point PCR. In addition, we characterize the complexity of the signal by exploring the number of detected alleles in each profile.


Subject(s)
DNA Fingerprinting , Datasets as Topic , Microsatellite Repeats , Alleles , DNA Damage , Forensic Genetics , Genotype , Humans , Polymerase Chain Reaction
14.
Forensic Sci Int Genet ; 31: 160-170, 2017 11.
Article in English | MEDLINE | ID: mdl-28950155

ABSTRACT

Samples containing low-copy numbers of DNA are routinely encountered in casework. The signal acquired from these sample types can be difficult to interpret as they do not always contain all of the genotypic information from each contributor, where the loss of genetic information is associated with sampling and detection effects. The present work focuses on developing a validation scheme to aid in mitigating the effects of the latter. We establish a scheme designed to simultaneously improve signal resolution and detection rates without costly large-scale experimental validation studies by applying a combined simulation and experimental based approach. Specifically, we parameterize an in silico DNA pipeline with experimental data acquired from the laboratory and use this to evaluate multifarious scenarios in a cost-effective manner. Metrics such as single-copy signal-to-noise resolution, false positive and false negative signal detection rates are used to select tenable laboratory parameters that result in high-fidelity signal in the single-copy regime. We demonstrate that the metrics acquired from simulation are consistent with experimental data obtained from two capillary electrophoresis platforms and various injection parameters. Once good resolution is obtained, analytical thresholds can be determined using detection error tradeoff analysis, if necessary. Decreasing the limit of detection of the forensic process to one copy of DNA is a powerful mechanism by which to increase the information content on minor components of a mixture, which is particularly important for probabilistic system inference. If the forensic pipeline is engineered such that high-fidelity electropherogram signal is obtained, then the likelihood ratio (LR) of a true contributor increases and the probability that the LR of a randomly chosen person is greater than one decreases. This is, potentially, the first step towards standardization of the analytical pipeline across operational laboratories.
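
The detection error tradeoff analysis mentioned above can be sketched as follows: sweep candidate analytical thresholds and, at each, record the false-positive rate (noise called as signal) and the false-negative rate (true single-copy signal missed). The function and the inputs are illustrative only.

```python
import numpy as np

def det_points(noise_peaks, single_copy_peaks, thresholds):
    """For each candidate analytical threshold, return (threshold,
    false-positive rate, false-negative rate), where a false positive is a
    noise peak at or above the threshold and a false negative is a true
    single-copy allele peak below it. Illustrative sketch only."""
    noise = np.asarray(noise_peaks, dtype=float)
    signal = np.asarray(single_copy_peaks, dtype=float)
    return [(t, float(np.mean(noise >= t)), float(np.mean(signal < t)))
            for t in thresholds]

# Hypothetical peak heights (RFU) for noise and single-copy allele peaks.
print(det_points([5, 8, 12, 20, 7], [35, 60, 28, 90, 45], thresholds=[10, 25, 50]))
```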


Subject(s)
DNA Fingerprinting/standards , Electrophoresis, Capillary , Humans , Likelihood Functions , Limit of Detection , Microsatellite Repeats , Monte Carlo Method , Reproducibility of Results
15.
Electrophoresis ; 38(6): 855-868, 2017 03.
Article in English | MEDLINE | ID: mdl-27981603

ABSTRACT

Short tandem repeat (STR) profiling from DNA samples has long been the bedrock of human identification. The laboratory process is composed of multiple procedures that include quantification, sample dilution, PCR, electrophoresis, and fragment analysis. The end product is a short tandem repeat electropherogram comprising signal from alleles, artifacts, and instrument noise. In order to optimize or alter laboratory protocols, a large number of validation samples must be created at significant expense. As a tool to support that process and to enable the exploration of complex scenarios without costly sample creation, a mechanistic stochastic model that incorporates each of the aforementioned processing features is described herein. The model allows rapid in silico simulation of electropherograms from multicontributor samples and enables detailed investigations of involved scenarios. An implementation of the model that is parameterized by extensive laboratory data is publicly available. To illustrate its utility, the model was employed to evaluate the effects of sample dilution, injection time, and cycle number on peak height, and the nature of stutter ratios at low template. We verify the model's findings by comparison with experimentally generated data.


Subject(s)
Computer Simulation , DNA Copy Number Variations , DNA/analysis , Electrophoresis, Capillary/methods , Polymerase Chain Reaction/methods , Alleles , DNA Fingerprinting , Humans , Microsatellite Repeats , Sensitivity and Specificity
16.
J Forensic Sci ; 62(2): 308-316, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27907229

ABSTRACT

In forensic DNA casework, the interpretation of an evidentiary profile may depend upon the assumption on the number of individuals from whom the evidence arose. Three methods of inferring the number of contributors (NOCIt, the maximum likelihood estimator, and maximum allele count) were evaluated using 100 test samples consisting of one to five contributors and 0.5-0.016 ng of template DNA amplified with Identifiler® Plus and PowerPlex® 16 HS. Results indicate that NOCIt was the most accurate of the three methods, requiring 0.07 ng of template DNA from any one contributor to consistently estimate the true number of contributors. Additionally, NOCIt returned repeatable results for 91% of samples analyzed in quintuplicate, while 50 single-source standards proved sufficient to calibrate the software. The data indicate that computational methods that employ a quantitative, probabilistic approach provide improved accuracy and additional pertinent information such as the uncertainty associated with the inferred number of contributors.


Subject(s)
DNA Fingerprinting/methods , DNA/genetics , Alleles , DNA/analysis , Gene Frequency , Humans , Likelihood Functions , Microsatellite Repeats , Monte Carlo Method , Polymerase Chain Reaction , Reproducibility of Results
17.
Forensic Sci Int Genet ; 22: 149-160, 2016 May.
Article in English | MEDLINE | ID: mdl-26946255

ABSTRACT

In forensic DNA interpretation, the likelihood ratio (LR) is often used to convey the strength of a match. Expanding on binary and semi-continuous methods that do not use all of the quantitative data contained in an electropherogram, fully continuous methods to calculate the LR have been created. These fully continuous methods utilize all of the information captured in the electropherogram, including the peak heights. Recently, methods that calculate the distribution of the LR using semi-continuous methods have also been developed. The LR distribution has been proposed as a way of studying the robustness of the LR, which varies depending on the probabilistic model used for its calculation. For example, the LR distribution can be used to calculate the p-value, which is the probability that a randomly chosen individual results in an LR greater than the LR obtained from the person-of-interest (POI). Hence, the p-value is a statistic that is different from, but related to, the LR, and it may be interpreted as the false positive rate resulting from a binary hypothesis test between the prosecution and defense hypotheses. Here, we present CEESIt, a method that combines the twin features of a fully continuous model to calculate the LR and its distribution, conditioned on the defense hypothesis, along with an associated p-value. CEESIt incorporates dropout, noise and stutter (reverse and forward) in its calculation. As calibration data, CEESIt uses single-source samples with known genotypes and calculates an LR for a specified POI on a question sample, along with the LR distribution and a p-value. The method was tested on 303 files representing 1-, 2- and 3-person samples injected using three injection times and containing between 0.016 and 1 ng of template DNA. Our data allow us to evaluate changes in the LR and p-value with respect to the complexity of the sample and to facilitate discussions regarding complex DNA mixture interpretation. We observed that the amount of template DNA from a contributor impacted the LR: small LRs resulted from contributors with low template masses. Moreover, as expected, we observed a decrease in p-values as the LR increased. A p-value of 10^-9 or lower was achieved in all cases where the LR was greater than 10^8. We tested the repeatability of CEESIt by running all samples in duplicate and found the results to be repeatable.
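
The p-value defined here, the probability that a randomly chosen non-contributor yields an LR at least as large as the PoI's, has a direct empirical counterpart if one can sample LRs for random genotypes under the defense hypothesis. A minimal sketch with invented inputs:

```python
def empirical_p_value(poi_lr, random_person_lrs):
    """Fraction of LRs computed for randomly chosen (non-contributing)
    genotypes that meet or exceed the PoI's LR; an empirical stand-in for
    the p-value described in the abstract."""
    hits = sum(1 for lr in random_person_lrs if lr >= poi_lr)
    return hits / len(random_person_lrs)

# Hypothetical LRs for 10 random genotypes against the same question sample.
random_lrs = [1e-6, 3e-4, 0.02, 0.8, 1.5, 4e-3, 9e-5, 0.1, 2e-2, 5e-7]
print(empirical_p_value(poi_lr=1.0e8, random_person_lrs=random_lrs))  # 0.0
```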


Subject(s)
Complex Mixtures/analysis , Complex Mixtures/genetics , DNA Fingerprinting/methods , DNA/analysis , DNA/genetics , Microsatellite Repeats , Genotype , Humans , Likelihood Functions , Models, Genetic , Models, Statistical
18.
J Forensic Sci ; 61(1): 177-85, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26280243

ABSTRACT

Impacts of validation design on DNA signal were explored, and the level of variation introduced by injection, capillary changes, amplification, and kit lot was surveyed by examining a set of replicate samples ranging in mass from 0.25 to 0.008 ng. The variations in peak height, heterozygous balance, dropout probabilities, and baseline noise were compared using common statistical techniques. Data indicate that amplification is the source of the majority of the variation observed in the peak heights, followed by capillary lots. The use of different amplification kit lots did not introduce variability into the peak heights, heterozygous balance, dropout, or baseline. Thus, if data from case samples run over a significant time period are not available during validation, the validation must be designed to, at a minimum, include multiple samples of varying quantity and known genotype, amplified and run over an extended period of time using multiple pipettes and capillaries.


Subject(s)
DNA/genetics , Specimen Handling/methods , DNA Fingerprinting , Humans , Reproducibility of Results , Sequence Analysis, DNA/methods
19.
Forensic Sci Int Genet ; 16: 172-180, 2015 May.
Article in English | MEDLINE | ID: mdl-25625964

ABSTRACT

Repetitive sequences in the human genome called short tandem repeats (STRs) are used in human identification for forensic purposes. Interpretation of DNA profiles generated using STRs is often problematic because of uncertainty in the number of contributors to the sample. Existing methods to identify the number of contributors work on the number of peaks observed and/or allele frequencies. We have developed a computational method called NOCIt that calculates the a posteriori probability (APP) on the number of contributors. NOCIt works on single-source calibration data consisting of known genotypes to compute the APP for an unknown sample. The method takes into account signal peak heights, population allele frequencies, allele dropout and stutter (a commonly occurring PCR artifact). We tested the performance of NOCIt using 278 experimental and 40 simulated DNA mixtures consisting of one to five contributors with total DNA mass from 0.016 to 0.25 ng. NOCIt correctly identified the number of contributors in 83% of the experimental samples and in 85% of the simulated mixtures, while the accuracy of the best pre-existing method to determine the number of contributors was 72% for the experimental samples and 73% for the simulated mixtures. Moreover, NOCIt calculated the APP for the true number of contributors to be at least 1% in 95% of the experimental samples and in all the simulated mixtures.


Subject(s)
Algorithms , Genotype , Microsatellite Repeats/genetics , Humans , Uncertainty
20.
J Forensic Sci ; 59(1): 199-207, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24117798

ABSTRACT

Biological fluid identification is an important facet of evidence examination in forensic laboratories worldwide. While identifying bodily fluids may provide insight into which downstream DNA methods to employ, these screening techniques consume a vital portion of the available evidence, are usually qualitative, and rely on visual interpretation. In contrast, qPCR yields information regarding the amount and proportion of amplifiable genetic material. In this study, dilution series of either semen or male saliva were prepared in either buffer or female blood. The samples were subjected to both lateral flow immunochromatographic test strips and qPCR analysis. Analytical figures of merit, including sensitivity, minimum distinguishable signal (MDS), and limit of detection (LOD), were calculated and compared between methods. By applying the theory of the propagation of random errors, LODs were determined to be 0.05 µL of saliva for the RSID™ Saliva cards, 0.03 µL of saliva for Quantifiler® Duo, and 0.001 µL of semen for Quantifiler® Duo. In conclusion, quantitative PCR was deemed a viable and effective screening method for subsequent DNA profiling due to its stability in different matrices, sensitivity, and low limits of detection.
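
The paper derives its limits of detection via propagation of random errors; as a rough illustration of the general idea, a simplified LOD can be taken as the analyte amount whose expected signal sits three standard deviations above the blank, under a linear calibration. Everything below (function, slope, blank values) is a hypothetical stand-in, not the paper's calculation.

```python
from statistics import stdev

def simple_lod(blank_signals, calibration_slope):
    """Simplified limit of detection: the analyte amount whose expected signal
    lies 3 standard deviations above the mean blank signal, assuming a linear
    calibration of signal = slope * amount + blank. A rough stand-in only."""
    return 3.0 * stdev(blank_signals) / calibration_slope

# Hypothetical blank readings and a calibration slope in signal units per uL.
print(simple_lod([0.02, 0.05, 0.03, 0.04], calibration_slope=1.8))
```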


Subject(s)
Chromatography, Affinity/instrumentation , DNA Fingerprinting/methods , Real-Time Polymerase Chain Reaction , Blood Chemical Analysis , Female , Humans , Limit of Detection , Male , Microscopy , Saliva/chemistry , Semen/chemistry , Semen/cytology , Seminal Vesicle Secretory Proteins/analysis , Spermatozoa/cytology , alpha-Amylases/analysis