Results 1 - 9 of 9
1.
Forensic Sci Int Genet ; 22: 149-160, 2016 May.
Article in English | MEDLINE | ID: mdl-26946255

ABSTRACT

In forensic DNA interpretation, the likelihood ratio (LR) is often used to convey the strength of a match. Expanding on binary and semi-continuous methods that do not use all of the quantitative data contained in an electropherogram, fully continuous methods to calculate the LR have been created. These fully continuous methods utilize all of the information captured in the electropherogram, including the peak heights. Recently, methods that calculate the distribution of the LR using semi-continuous methods have also been developed. The LR distribution has been proposed as a way of studying the robustness of the LR, which varies depending on the probabilistic model used for its calculation. For example, the LR distribution can be used to calculate the p-value, which is the probability that a randomly chosen individual results in an LR greater than the LR obtained from the person-of-interest (POI). Hence, the p-value is a statistic that is different from, but related to, the LR; and it may be interpreted as the false positive rate resulting from a binary hypothesis test between the prosecution and defense hypotheses. Here, we present CEESIt, a method that combines the twin features of a fully continuous model to calculate the LR and its distribution, conditioned on the defense hypothesis, along with an associated p-value. CEESIt incorporates dropout, noise and stutter (reverse and forward) in its calculation. As calibration data, CEESIt uses single source samples with known genotypes and calculates an LR for a specified POI on a question sample, along with the LR distribution and a p-value. The method was tested on 303 files representing 1-, 2- and 3-person samples injected using three injection times containing between 0.016 and 1 ng of template DNA. Our data allow us to evaluate changes in the LR and p-value with respect to the complexity of the sample and to facilitate discussions regarding complex DNA mixture interpretation. We observed that the amount of template DNA from the contributor impacted the LR: small LRs resulted from contributors with low template masses. Moreover, as expected, we observed a decrease of p-values as the LR increased. A p-value of 10^-9 or lower was achieved in all the cases where the LR was greater than 10^8. We tested the repeatability of CEESIt by running all samples in duplicate and found the results to be repeatable.
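
The p-value defined in this abstract (the probability that a randomly chosen individual yields an LR greater than the POI's LR) can be illustrated with a simple empirical estimate. The sketch below is not CEESIt's procedure, which the abstract does not describe; it assumes that LRs for randomly drawn genotypes have already been computed by some model, and the function name and sample values are hypothetical.

```python
def empirical_p_value(lr_poi, lr_random):
    """Fraction of random-genotype LRs strictly greater than the POI's LR.

    lr_poi:    LR obtained for the person of interest.
    lr_random: LRs computed for randomly drawn genotypes (assumed given).
    """
    if not lr_random:
        raise ValueError("need at least one random-genotype LR")
    exceed = sum(1 for lr in lr_random if lr > lr_poi)
    return exceed / len(lr_random)


# Made-up LRs for five random genotypes; only one exceeds the POI's LR of 10.
simulated = [0.001, 0.02, 0.5, 3.0, 40.0]
print(empirical_p_value(10.0, simulated))
```

A plain count like this cannot resolve p-values far below one over the number of random genotypes, so values near 10^-9 would need a different estimation strategy.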


Subjects
Complex Mixtures/analysis , Complex Mixtures/genetics , DNA Fingerprinting/methods , DNA/analysis , DNA/genetics , Microsatellite Repeats , Genotype , Humans , Likelihood Functions , Genetic Models , Statistical Models
2.
Forensic Sci Int Genet ; 19: 107-122, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26218981

ABSTRACT

There are three dominant contributing factors that distort short tandem repeat profile measurements, two of which, stutter and variations in the allelic peak heights, have been described extensively. Here we characterise the remaining component, baseline noise. A probabilistic characterisation of the non-allelic noise peaks is not only inherently useful for statistical inference but is also significant for establishing a detection threshold. We do this by analysing the data from 643 single person profiles for the Identifiler Plus kit and 303 for the PowerPlex 16 HS kit. This investigation reveals that although the dye colour is a significant factor, it is not sufficient to have a per-dye colour description of the noise. Furthermore, we show that on a per-locus basis, out of the Gaussian, log-normal, and gamma distribution classes, baseline noise is best described by log-normal distributions, and we provide a methodology for setting an analytical threshold based on that deduction. In the PowerPlex 16 HS kit, we observe evidence of significant stutter at two repeat units shorter than the allelic peak, which has implications for the definition of baseline noise and signal interpretation. In general, the DNA input mass has an influence on the noise distribution. Thus, it is advisable to study noise and, consequently, to infer quantities like the analytical threshold from data with a DNA input mass comparable to the DNA input mass of the samples to be analysed.
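
One way to turn a log-normal noise description into an analytical threshold is to fit the distribution to the noise peak heights at a locus and take a high quantile. The abstract does not spell out the authors' methodology, so the sketch below is only an assumed quantile-based approach; the function name, the default quantile, and the example heights are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_threshold(noise_heights, quantile=0.999):
    """Quantile of a log-normal fitted to positive noise peak heights.

    The fit uses the mean and sample standard deviation of log(height);
    the returned height is exceeded by noise with probability 1 - quantile
    under the fitted model.
    """
    if len(noise_heights) < 2 or any(h <= 0 for h in noise_heights):
        raise ValueError("need at least two strictly positive heights")
    logs = [math.log(h) for h in noise_heights]
    mu, sigma = mean(logs), stdev(logs)
    z = NormalDist().inv_cdf(quantile)  # standard normal quantile
    return math.exp(mu + z * sigma)


# Made-up noise peak heights (RFU) for a single locus.
heights = [3.0, 4.5, 5.0, 6.2, 7.1, 8.4, 9.9]
print(lognormal_threshold(heights))
```

Because the abstract reports that DNA input mass influences the noise distribution, such a fit would be made on noise data from samples with an input mass comparable to the samples under analysis.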


Subjects
Probability , DNA/genetics , Humans , Likelihood Functions , Microsatellite Repeats/genetics
4.
Forensic Sci Int Genet ; 16: 172-180, 2015 May.
Article in English | MEDLINE | ID: mdl-25625964

ABSTRACT

Repetitive sequences in the human genome called short tandem repeats (STRs) are used in human identification for forensic purposes. Interpretation of DNA profiles generated using STRs is often problematic because of uncertainty in the number of contributors to the sample. Existing methods to identify the number of contributors work on the number of peaks observed and/or allele frequencies. We have developed a computational method called NOCIt that calculates the a posteriori probability (APP) on the number of contributors. NOCIt works on single source calibration data consisting of known genotypes to compute the APP for an unknown sample. The method takes into account signal peak heights, population allele frequencies, allele dropout and stutter, a commonly occurring PCR artifact. We tested the performance of NOCIt using 278 experimental and 40 simulated DNA mixtures consisting of one to five contributors with total DNA mass from 0.016 to 0.25 ng. NOCIt correctly identified the number of contributors in 83% of the experimental samples and in 85% of the simulated mixtures, while the accuracy of the best pre-existing method to determine the number of contributors was 72% for the experimental samples and 73% for the simulated mixtures. Moreover, NOCIt calculated the APP for the true number of contributors to be at least 1% in 95% of the experimental samples and in all the simulated mixtures.
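
An a posteriori probability on the number of contributors follows from Bayes' rule once a likelihood of the observed profile is available for each candidate number. The sketch below shows only that final normalisation step; it assumes the per-candidate log-likelihoods are supplied by some model (NOCIt's likelihood computation is not reproduced here), and the function name, the uniform default prior, and the example values are hypothetical.

```python
import math


def posterior_over_contributors(log_likelihoods, priors=None):
    """Normalise log P(data | n) into P(n | data) for each candidate n.

    log_likelihoods: dict mapping n (number of contributors) to a
                     log-likelihood, assumed to be computed elsewhere.
    priors:          optional dict of prior probabilities; uniform if omitted.
    """
    ns = sorted(log_likelihoods)
    if priors is None:
        priors = {n: 1.0 / len(ns) for n in ns}
    log_joint = {n: log_likelihoods[n] + math.log(priors[n]) for n in ns}
    peak = max(log_joint.values())  # subtract the max to avoid underflow
    weights = {n: math.exp(v - peak) for n, v in log_joint.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Made-up log-likelihoods for one to five contributors.
app = posterior_over_contributors({1: -310.0, 2: -295.5, 3: -294.0, 4: -297.2, 5: -301.8})
print(max(app, key=app.get), app)
```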


Subjects
Algorithms , Genotype , Microsatellite Repeats/genetics , Humans , Uncertainty
5.
Nat Biotechnol ; 31(8): 726-33, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23851448

ABSTRACT

Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences as correlation-based networks contain numerous indirect relationships. Here we present a general method for inferring direct effects from an observed correlation matrix containing both direct and indirect effects. We formulate the problem as the inverse of network convolution, and introduce an algorithm that removes the combined effect of all indirect paths of arbitrary length in a closed-form solution by exploiting eigen-decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone. In addition to its theoretical impact as a foundational graph theoretic tool, our results suggest network deconvolution is widely applicable for computing direct dependencies in network science across diverse disciplines.
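
The inverse-of-convolution idea can be illustrated on a toy network. If the observed matrix is the sum of all powers of the direct matrix, G_obs = G_dir + G_dir^2 + ..., and the series converges, then G_dir = G_obs (I + G_obs)^-1. The sketch below uses that matrix-inverse form with a small Gauss-Jordan routine instead of the eigen-decomposition mentioned in the abstract, and it omits any scaling or preprocessing; all function names and the three-node example are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def observed_from_direct(g_dir, terms=200):
    """Truncated series G_dir + G_dir^2 + ... (assumes it converges)."""
    n = len(g_dir)
    total = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in g_dir]
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, g_dir)
    return total


def deconvolve(g_obs):
    """Recover direct effects as G_obs (I + G_obs)^-1."""
    n = len(g_obs)
    i_plus = [[g_obs[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    return matmul(g_obs, inverse(i_plus))


# Chain 0 - 1 - 2: nodes 0 and 2 are linked only indirectly through node 1.
g_dir = [[0.0, 0.3, 0.0], [0.3, 0.0, 0.3], [0.0, 0.3, 0.0]]
g_obs = observed_from_direct(g_dir)
print(g_obs[0][2], deconvolve(g_obs)[0][2])
```

In the observed matrix the pair (0, 2) has a clearly nonzero entry created by the path through node 1, while the deconvolved matrix returns that entry to approximately zero.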


Subjects
Computational Biology , Gene Regulatory Networks , Statistical Models , Algorithms , Computer Simulation , Oligonucleotide Array Sequence Analysis , Sequence Alignment , Signal Transduction , Software
7.
Biochemistry ; 47(44): 11536-46, 2008 Nov 04.
Article in English | MEDLINE | ID: mdl-18839965

ABSTRACT

ATP-dependent proteases are processive, meaning that they degrade full-length proteins into small peptide products without releasing large intermediates along the reaction pathway. In the case of the bacterial ATP-dependent protease ClpAP, ATP hydrolysis by the ClpA component has been proposed to be required for processive proteolysis of full-length protein substrates. We present here data showing that in the absence of the ATPase subunit ClpA, the protease subunit ClpP can degrade full-length protein substrates processively, albeit at a greatly reduced rate. Moreover, the size distribution of peptide products from a ClpP-catalyzed digest is remarkably similar to the size distribution of products from a ClpAP-catalyzed digest. The ClpAP- and ClpP-generated peptide product size distributions are fitted well by a sum of multiple underlying Gaussian peaks with means at integral multiples of approximately 900 Da (7-8 amino acids). Our results are consistent with a mechanism in which ClpP controls product sizes by alternating between translocation in steps of 7-8 (+/-2-3) amino acid residues and proteolysis. On the structural and molecular level, the step size may be controlled by the spacing between the ClpP active sites, and processivity may be achieved by coupling peptide bond hydrolysis to the binding and release of substrate and products in the protease chamber.
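
The size-distribution model described here, a sum of Gaussian peaks with means at integral multiples of approximately 900 Da, can be written down directly. The sketch below only evaluates such a mixture; the peak weights and the common peak width are hypothetical placeholders, not fitted values from the study.

```python
import math


def peptide_size_density(mass_da, weights, step_da=900.0, sigma_da=150.0):
    """Density of a Gaussian mixture with means at k * step_da, k = 1, 2, ...

    weights:  relative weight of each peak (hypothetical values).
    sigma_da: common peak width in Da (hypothetical value).
    """
    norm = sigma_da * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k, w in enumerate(weights, start=1):
        mean = k * step_da  # peak centred on an integral multiple of the step
        total += w * math.exp(-0.5 * ((mass_da - mean) / sigma_da) ** 2) / norm
    return total


# Made-up weights for the first three peaks (about 900, 1800 and 2700 Da).
weights = [0.5, 0.3, 0.2]
for mass in (900.0, 1350.0, 1800.0):
    print(mass, peptide_size_density(mass, weights))
```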


Subjects
Endopeptidase Clp/metabolism , Escherichia coli Proteins/metabolism , Adenosine Triphosphatases/chemistry , Adenosine Triphosphatases/metabolism , Adenosine Triphosphate/metabolism , Catalytic Domain , Endopeptidase Clp/chemistry , Enzyme Stability , Escherichia coli Proteins/chemistry , Hydrolysis , Biological Models , Molecular Models , Molecular Weight , Peptides/chemistry , Peptides/metabolism , Quaternary Protein Structure , Protein Subunits , Proteins/chemistry , Proteins/metabolism , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism , Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry , Substrate Specificity
8.
Sci Am ; 296(6): 78-85, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17663228
9.
J Chem Inf Model ; 47(5): 1973-8, 2007.
Article in English | MEDLINE | ID: mdl-17628056

ABSTRACT

Assessment of the purity of chromatographic peaks is an important step in developing and validating purification procedures for complex mixtures. While curve-fitting techniques can be useful for determining the retention times and relative concentrations of the components of a chromatographic peak, their utility is limited by the lack of unambiguous criteria for determining the number of such components. In this work, we present a computational technique for analyzing chromatograms to estimate the number of components, their retention times, and their relative concentrations. In contrast to Fourier-transform-based techniques, the technique we present does not require manual peak identification. It is based on curve-fitting and uses the Akaike information criterion to estimate the number of components. Application of the technique to chromatograms obtained from size-exclusion and reverse-phase chromatography of test mixtures indicates that it is useful for the characterization of complex mixtures.
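
Choosing the number of components with the Akaike information criterion amounts to fitting the chromatogram with one, two, three, ... peaks and keeping the fit with the lowest AIC. The sketch below shows only the selection step, using the least-squares form AIC = n ln(RSS/n) + 2k, which assumes Gaussian residuals. The residual sums of squares are taken as given, and the function names, the three parameters per component, and the numbers are hypothetical rather than taken from the paper.

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC for a least-squares fit, up to an additive constant."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def select_component_count(rss_by_count, n_points, params_per_component=3):
    """Return the component count with the lowest AIC, plus all scores.

    rss_by_count: dict mapping a number of fitted components to the residual
                  sum of squares of that fit (assumed computed elsewhere).
    """
    scores = {m: aic_least_squares(rss, n_points, m * params_per_component)
              for m, rss in rss_by_count.items()}
    return min(scores, key=scores.get), scores


# Made-up residuals: a third component barely improves the fit.
best, scores = select_component_count({1: 50.0, 2: 5.0, 3: 4.9}, n_points=200)
print(best, scores)
```

The penalty term 2k is what stops the procedure from always preferring more components: an extra peak is accepted only when it lowers the residual enough to offset its added parameters.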


Subjects
Chromatography/methods , Informatics , Algorithms , Buffers , Gel Chromatography , Fourier Analysis , Least-Squares Analysis , Likelihood Functions , Statistical Models