1.
Electrophoresis ; 45(9-10): 877-884, 2024 May.
Article in English | MEDLINE | ID: mdl-38196015

ABSTRACT

A macrohaplotype combines multiple types of phased DNA variants, increasing forensic discrimination power. High-quality long sequencing reads, for example, PacBio HiFi reads, provide data to detect macrohaplotypes in multiploidy and DNA mixtures. However, bioinformatics tools for detecting macrohaplotypes are lacking. In this study, we developed a bioinformatics software tool, MacroHapCaller, in which targeted loci (i.e., short tandem repeats [STRs], single nucleotide polymorphisms, and insertions and deletions) are genotyped and combined with novel algorithms to call macrohaplotypes from long reads. MacroHapCaller uses physical phasing (i.e., read-backed phasing) to identify macrohaplotypes, and thus it can detect multi-allelic macrohaplotypes for a given sample. MacroHapCaller was validated with data generated from our designed targeted PacBio HiFi sequencing pipeline, which sequenced ∼8-kb amplicon regions harboring 20 core forensic STR loci in human benchmark samples HG002 and HG003. MacroHapCaller was also validated on whole-genome long-read sequencing data. Robust and accurate genotyping and phased macrohaplotypes were obtained with MacroHapCaller compared with the known ground truth. MacroHapCaller achieved a higher or consistent genotyping accuracy and faster speed than existing tools HipSTR and DeepVar. MacroHapCaller enables efficient macrohaplotype analysis from high-throughput sequencing data and supports applications using discriminating macrohaplotypes.


Subjects
Haplotypes , High-Throughput Nucleotide Sequencing , Polymorphism, Single Nucleotide , Polyploidy , Sequence Analysis, DNA , Software , Humans , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Algorithms , Computational Biology/methods , DNA/genetics , DNA/analysis , Microsatellite Repeats/genetics , Forensic Genetics/methods , Genotyping Techniques/methods
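
The read-backed phasing idea in entry 1 can be illustrated with a minimal sketch. It is not MacroHapCaller's algorithm: the function name, the input layout (one dict of locus-to-allele calls per read), the locus names, and the `min_fraction` noise filter are all assumptions made for illustration. The point is only that alleles seen together on one long read are physically phased, so any number of distinct macrohaplotypes can be reported for a sample.

```python
from collections import Counter


def call_macrohaplotypes(reads, loci, min_fraction=0.1):
    """Count allele combinations that co-occur on single reads.

    reads: list of dicts mapping locus name -> allele observed on that read
           (hypothetical input format, one dict per long read).
    loci:  ordered list of locus names that define the macrohaplotype.
    """
    counts = Counter()
    for read in reads:
        # Only reads spanning every targeted locus can phase them all.
        if all(locus in read for locus in loci):
            counts[tuple(read[locus] for locus in loci)] += 1

    total = sum(counts.values())
    if total == 0:
        return []

    # Keep combinations supported by at least min_fraction of spanning reads.
    kept = [(hap, n) for hap, n in counts.items() if n / total >= min_fraction]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Toy data: two true macrohaplotypes, one error read, one partial read.
loci = ["STR_A", "SNP_1", "INDEL_1"]
reads = (
    [{"STR_A": "12", "SNP_1": "A", "INDEL_1": "ins"}] * 5
    + [{"STR_A": "14", "SNP_1": "G", "INDEL_1": "del"}] * 4
    + [{"STR_A": "13", "SNP_1": "A", "INDEL_1": "ins"}]
    + [{"STR_A": "12", "SNP_1": "A"}]
)
macrohaplotypes = call_macrohaplotypes(reads, loci, min_fraction=0.2)
```

Because the result is a list rather than a fixed pair, the same sketch covers samples with more than two macrohaplotypes, such as mixtures.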
2.
Forensic Sci Int Genet ; 63: 102807, 2023 03.
Article in English | MEDLINE | ID: mdl-36462297

ABSTRACT

PCR artifacts are an ever-present challenge in sequencing applications. These artifacts can seriously limit the analysis and interpretation of low-template samples and mixtures, especially with respect to a minor contributor. In medicine, molecular barcoding techniques have been employed to decrease the impact of PCR error and to allow the examination of low-abundance somatic variation. In principle, it should be possible to apply the same techniques to the forensic analysis of mixtures. To that end, several short tandem repeat loci were selected for targeted sequencing, and a bioinformatic pipeline for analyzing the sequence data was developed. The pipeline notes the relevant unique molecular identifiers (UMIs) attached to each read and, using machine learning, filters the noise products out of the set of potential alleles. To evaluate this pipeline, DNA from pairs of individuals was mixed at different ratios (1:1, 1:9) and sequenced with different starting amounts of DNA (10, 1 and 0.1 ng). Naïvely using the information in the molecular barcodes led to increased performance, with the machine learning resulting in an additional benefit. In concrete terms, using the UMI data results in less noise for a given amount of drop-out. For instance, if thresholds are selected that filter out a quarter of the true alleles, using read counts accepts 2381 noise alleles and using raw UMI counts accepts 1726 noise alleles, while the machine learning approach accepts only 307.


Subjects
DNA , High-Throughput Nucleotide Sequencing , Humans , Alleles , DNA/analysis , DNA Fingerprinting/methods , Sequence Analysis, DNA , Microsatellite Repeats
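
The contrast in entry 2 between thresholding on read counts and on raw UMI counts can be sketched as follows. This is an illustrative toy, not the published pipeline: the function names, the `(umi, allele)` record format, and the threshold values are assumptions, and the machine learning filter is not reproduced.

```python
from collections import defaultdict


def allele_support(records):
    """Summarize support per allele from (umi, allele) pairs, one per read.

    Returns two dicts: reads per allele and distinct UMIs per allele.
    """
    read_counts = defaultdict(int)
    umis = defaultdict(set)
    for umi, allele in records:
        read_counts[allele] += 1
        umis[allele].add(umi)
    return dict(read_counts), {allele: len(s) for allele, s in umis.items()}


def passing(counts, threshold):
    """Alleles whose support meets an (arbitrary, illustrative) threshold."""
    return sorted(allele for allele, n in counts.items() if n >= threshold)


# Toy locus: allele "12" comes from three template molecules; allele "11"
# is a stutter product that arose once and was then amplified four-fold.
records = (
    [("u1", "12")] * 3
    + [("u2", "12")] * 3
    + [("u3", "12")] * 2
    + [("u1", "11")] * 4
)
read_counts, umi_counts = allele_support(records)
by_reads = passing(read_counts, threshold=3)
by_umis = passing(umi_counts, threshold=2)
```

In this toy case the amplified artifact clears the read-count threshold but not the UMI-count threshold, because PCR duplication inflates reads without adding new template molecules.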
3.
Forensic Sci Int Genet ; 55: 102568, 2021 11.
Article in English | MEDLINE | ID: mdl-34416654

ABSTRACT

Short tandem repeats of the nuclear genome have been the preferred markers for analyzing forensic DNA mixtures. However, when nuclear DNA in a sample is degraded or limited, mitochondrial DNA (mtDNA) markers provide a powerful alternative. Though historically considered challenging, the interpretation and analysis of mtDNA mixtures have recently seen renewed interest with the advent of massively parallel sequencing. However, there are only a few software tools available for mtDNA mixture interpretation. To address this gap, the Mitochondrial Mixture Deconvolution and Interpretation Tool (MMDIT) was developed. MMDIT is an interactive application complete with a graphical user interface that allows users to deconvolve mtDNA (whole or partial genomes) mixtures into constituent donor haplotypes and estimate random match probabilities on these resultant haplotypes. In cases where deconvolution might not be feasible, the software allows mixture analysis directly within a binary framework (i.e., qualitatively, using only data on allele presence/absence). This paper explains the functionality of MMDIT, using an example of an in vitro two-person mtDNA mixture with a ratio of 1:4. The uniqueness of MMDIT lies in its ability to resolve mixtures into complete donor haplotypes using a statistical phasing framework before mixture analysis and to evaluate statistical weights employing a novel graph algorithm approach. MMDIT is the first available open-source software that can automate mtDNA mixture deconvolution and analysis. The MMDIT web application can be accessed online at https://www.unthsc.edu/mmdit/. The source code is available at https://github.com/SammedMandape/MMDIT_UI and archived on Zenodo (https://doi.org/10.5281/zenodo.4770184).


Subjects
DNA, Mitochondrial , High-Throughput Nucleotide Sequencing , DNA, Mitochondrial/genetics , Haplotypes , Humans , Sequence Analysis, DNA , Software
4.
Genes (Basel) ; 12(2)2021 01 20.
Article in English | MEDLINE | ID: mdl-33498312

ABSTRACT

Despite the benefits of quantitative data generated by massively parallel sequencing, resolving mitotypes from mixtures occurring in certain ratios remains challenging. In this study, a bioinformatic mixture deconvolution method centered on population-based phasing was developed and validated. The method was first tested on 270 in silico two-person mixtures varying in mixture proportions. An assortment of external reference panels containing information on haplotypic variation (from similar and different haplogroups) was leveraged to assess the effect of panel composition on phasing accuracy. Building on these simulations, mitochondrial genomes from the Human Mitochondrial DataBase were sourced to populate the panels and key parameter values were identified by deconvolving an additional 7290 in silico two-person mixtures. Finally, employing an optimized reference panel and phasing parameters, the approach was validated with in vitro two-person mixtures with differing proportions. Deconvolution was most accurate when the haplotypes in the mixture were similar to haplotypes present in the reference panel and when the mixture ratios were neither highly imbalanced nor subequal (e.g., 4:1). Overall, errors in haplotype estimation were largely bounded by the accuracy of the mixture's genotype results. The proposed framework is the first available approach that automates the reconstruction of complete individual mitotypes from mixtures, even in ratios that have traditionally been considered problematic.


Subjects
DNA, Mitochondrial , Forensic Genetics/methods , High-Throughput Nucleotide Sequencing , Models, Statistical , Algorithms , Bayes Theorem , Computational Biology/methods , Genome, Mitochondrial , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Single Nucleotide , Reproducibility of Results , Sequence Analysis, DNA/methods
5.
Forensic Sci Int Genet ; 51: 102459, 2021 03.
Article in English | MEDLINE | ID: mdl-33429137

ABSTRACT

Unique molecular identifiers (UMIs) are a promising approach to contend with errors generated during PCR and massively parallel sequencing (MPS). With UMI technology, random molecular barcodes are ligated to template DNA molecules prior to PCR, allowing PCR and sequencing error to be tracked and corrected bioinformatically. UMIs have the potential to be particularly informative for the interpretation of short tandem repeats (STRs). Traditional MPS approaches may simply lead to the observation of alleles that are consistent with the hypotheses of stutter, while with UMIs stutter products may be bioinformatically re-associated with their parental alleles and subsequently removed. Herein, a bioinformatics pipeline named strumi is described that is designed for the analysis of STRs that are tagged with UMIs. Unlike other tools, strumi is an alignment-free, machine-learning-driven algorithm that clusters individual MPS reads into UMI families, infers consensus super-reads that represent each family and provides an estimate of the resulting haplotype's accuracy. Super-reads, in turn, approximate independent measurements not of the PCR products, but of the original template molecules, both in terms of quantity and sequence identity. Provisional assessments show that naïve threshold-based approaches generate super-reads that are accurate (∼97 % haplotype accuracy, compared to ∼78 % when UMIs are not used), and the application of a more nuanced machine learning approach increases the accuracy to ∼99.5 % depending on the level of certainty desired. With these features, UMIs may greatly simplify probabilistic genotyping systems and reduce uncertainty. However, the ability to interpret alleles at trace levels also permits the interpretation, characterization and quantification of contamination as well as somatic variation (including somatic stutter), which may present newfound challenges.


Subjects
High-Throughput Nucleotide Sequencing/methods , Microsatellite Repeats , Sequence Analysis, DNA/methods , DNA Fingerprinting , Humans
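
The grouping of reads into UMI families and the inference of one consensus super-read per family, as described in entry 5, can be sketched in its naïve form. This is not strumi itself: the function name, the `(umi, sequence)` input format, the exact-match UMI grouping, the majority-vote consensus, and the `min_family_size` cutoff are assumptions for illustration, and the machine learning step is omitted.

```python
from collections import Counter, defaultdict


def build_super_reads(reads, min_family_size=2):
    """Collapse reads sharing a UMI into one consensus "super-read".

    reads: iterable of (umi, sequence) pairs (hypothetical input format).
    Returns {umi: (consensus_sequence, fraction_of_family_agreeing)}.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    super_reads = {}
    for umi, seqs in families.items():
        # Very small families give no basis for a majority vote.
        if len(seqs) < min_family_size:
            continue
        # Whole-sequence majority vote: STR stutter changes read length,
        # so a per-base vote would need an alignment first.
        consensus, support = Counter(seqs).most_common(1)[0]
        super_reads[umi] = (consensus, support / len(seqs))
    return super_reads


# Toy data: one family containing a stutter read (one repeat unit shorter),
# one clean family, and one singleton that is discarded.
reads = (
    [("AACG", "GATAGATAGATA")] * 3
    + [("AACG", "GATAGATA")]
    + [("TTGC", "GATAGATAGATAGATA")] * 2
    + [("CCAT", "GATAGATAGATA")]
)
super_reads = build_super_reads(reads)
```

Each surviving super-read stands in for one original template molecule, and the agreement fraction is a crude proxy for the accuracy estimate the abstract mentions.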
6.
Mitochondrion ; 55: 122-133, 2020 11.
Article in English | MEDLINE | ID: mdl-32949792

ABSTRACT

Nuclear mitochondrial DNA segments (NUMTs) are generated via transfer of portions of the mitochondrial genome into the nuclear genome. Given their common origin, mitochondrial and NUMT segments may co-amplify when the same set of primers is used. Thus, analysis of the variation of the mitochondrial genome must take into account this co-amplification of mitochondrial and NUMT sequences. The study herein builds on data from the study by Strobl et al. (2019), in which multiple point heteroplasmies were called with an "N" to prevent NUMT sequences that mimic mitochondrial heteroplasmy from being labeled and interpreted as sequence variants of true mitochondrial origin. Each of these point heteroplasmies was studied in greater detail, both molecularly and bioinformatically, to determine whether NUMT or true mitochondrial DNA variation was present. The bioinformatic and molecular tools available to help distinguish between NUMT and mitochondrial DNA, and the effect of NUMT sequences on interpretation, are discussed.


Subjects
Cell Nucleus/genetics , DNA, Mitochondrial/classification , Mitochondria/genetics , Whole Genome Sequencing/methods , Computational Biology/methods , DNA, Mitochondrial/isolation & purification , Genetic Variation , High-Throughput Nucleotide Sequencing , Humans , Phylogeny