Results 1 - 13 of 13
1.
Nat Genet ; 55(5): 746-752, 2023 05.
Article in English | MEDLINE | ID: mdl-37038003

ABSTRACT

Phylogenetics has a crucial role in genomic epidemiology. Enabled by unparalleled volumes of genome sequence data generated to study and help contain the COVID-19 pandemic, phylogenetic analyses of SARS-CoV-2 genomes have shed light on the virus's origins, spread, and the emergence and reproductive success of new variants. However, most phylogenetic approaches, including maximum likelihood and Bayesian methods, cannot scale to the size of the datasets from the current pandemic. We present 'MAximum Parsimonious Likelihood Estimation' (MAPLE), an approach for likelihood-based phylogenetic analysis of epidemiological genomic datasets at unprecedented scales. MAPLE infers SARS-CoV-2 phylogenies more accurately than existing maximum likelihood approaches while running up to thousands of times faster, and requiring at least 100 times less memory on large datasets. This extends the reach of genomic epidemiology, allowing the continued use of accurate phylogenetic, phylogeographic and phylodynamic analyses on datasets of millions of genomes.


Subjects
COVID-19; Humans; Phylogeny; COVID-19/epidemiology; COVID-19/genetics; SARS-CoV-2/genetics; Likelihood Functions; Pandemics; Bayes Theorem
2.
bioRxiv ; 2022 Jul 18.
Article in English | MEDLINE | ID: mdl-35350209

ABSTRACT

Phylogenetics plays a crucial role in the interpretation of genomic data [1]. Phylogenetic analyses of SARS-CoV-2 genomes have allowed the detailed study of the virus's origins [2], of its international [3,4] and local [4-9] spread, and of the emergence [10] and reproductive success [11] of new variants, among many applications. These analyses have been enabled by the unparalleled volumes of genome sequence data generated and employed to study and help contain the pandemic [12]. However, preferred model-based phylogenetic approaches including maximum likelihood and Bayesian methods, mostly based on Felsenstein's 'pruning' algorithm [13,14], cannot scale to the size of the datasets from the current pandemic [4,15], hampering our understanding of the virus's evolution and transmission [16]. We present new approaches, based on reworking Felsenstein's algorithm, for likelihood-based phylogenetic analysis of epidemiological genomic datasets at unprecedented scales. We exploit near-certainty regarding ancestral genomes, and the similarities between closely related and densely sampled genomes, to greatly reduce computational demands for memory and time. Combined with new methods for searching amongst candidate evolutionary trees, this results in our MAPLE ('MAximum Parsimonious Likelihood Estimation') software giving better results than popular approaches such as FastTree 2 [17], IQ-TREE 2 [18], RAxML-NG [19] and UShER [15]. Our approach therefore allows complex and accurate probabilistic phylogenetic analyses of millions of microbial genomes, extending the reach of genomic epidemiology. Future epidemiological datasets are likely to be even larger than those currently associated with COVID-19, and other disciplines such as metagenomics and biodiversity science are also generating huge numbers of genome sequences [20-22]. Our methods will permit continued use of preferred likelihood-based phylogenetic analyses.
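For readers unfamiliar with the algorithm this abstract builds on, the following is a minimal sketch of the classical Felsenstein pruning recursion for a single alignment column. It is not MAPLE's reworked version. The JC69 substitution model, the nested-list tree encoding, and all function names are assumptions chosen for illustration.

```python
import math

NUCS = "ACGT"


def jc69(t):
    """JC69 transition probabilities for a branch of length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partial(node):
    """Partial likelihood vector: P(observed leaves below node | state x at node).

    A leaf is a one-letter string; an internal node is a list of
    (child, branch_length) pairs. This encoding is hypothetical.
    """
    if isinstance(node, str):
        return [1.0 if n == node else 0.0 for n in NUCS]
    vec = [1.0] * 4
    for child, t in node:
        p, below = jc69(t), partial(child)
        for x in range(4):
            # Sum over the child's possible states, weighted by transition probability.
            vec[x] *= sum(p[x][y] * below[y] for y in range(4))
    return vec


def site_likelihood(root):
    """Likelihood of one site, with uniform base frequencies at the root."""
    return sum(0.25 * v for v in partial(root))


# Three sampled genomes: two identical 'A' tips and one 'G' tip.
tree = [([("A", 0.1), ("A", 0.1)], 0.05), ("G", 0.2)]
print(site_likelihood(tree))
```

Every node stores a full four-entry vector for every site, which is the cost that grows prohibitive on millions of genomes. The abstract states that MAPLE reduces this by exploiting near-certainty about ancestral genomes.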

4.
Genes (Basel) ; 9(7)2018 Jul 06.
Article in English | MEDLINE | ID: mdl-29986475

ABSTRACT

Hepatitis C virus (HCV) causes a major health burden and can be effectively treated by direct-acting antivirals (DAAs). The non-structural protein 5A (NS5A), which plays a role in the viral genome replication, is one of the DAAs’ targets. Resistance-associated viruses (RAVs) harbouring NS5A resistance-associated mutations (RAMs) have been described at baseline and after therapy failure. A mutation from glutamine to arginine at position 30 (Q30R) is a characteristic RAM for the HCV sub/genotype (GT) 1a, but arginine corresponds to the wild type in the GT-1b; still, GT-1b strains are susceptible to NS5A-inhibitors. In this study, we show that GT-1b strains with R30Q often display other specific NS5A substitutions, particularly in positions 24 and 34. We demonstrate that in GT-1b secondary substitutions usually happen after initial R30Q development in the phylogeny, and that the chemical properties of the corresponding amino acids serve to restore the positive charge in this region, acting as compensatory mutations. These findings may have implications for RAVs treatment.

5.
Antivir Ther ; 23(6): 485-493, 2018.
Article in English | MEDLINE | ID: mdl-29745936

ABSTRACT

BACKGROUND: HCV infections can now be completely cured, thanks to the currently marketed direct-acting antivirals (DAAs). It is known that HCV patients carry viral populations with baseline polymorphisms and/or mutations that make them resistant against some of these DAAs, which can negatively impact the patient's treatment outcome. Using complete HCV coding sequences isolated from 1,306 treatment-naive patients of genotypes (GTs) 1, 2, 3, 4 and 6 from around the globe, we studied the prevalence of baseline resistance-associated polymorphisms (RAPs) and resistance mutations (RMs) against DAAs that are currently on the market or in clinical trials. METHODS: The HCV genome sequences used in this study were retrieved from the NCBI database. RAPs and RMs, with reference to HCV GT1a, were identified using the HCV Geno2pheno web server. RESULTS: Nearly 50% of the total amino acid positions (including NS3 protease, NS5A and NS5B) studied are baseline polymorphisms that differentiated one GT from the rest. A proportion of these baseline polymorphisms and baseline non-polymorphic RMs could confer a significant increase in resistance against DAAs. CONCLUSIONS: In this study, we show the presence and prevalence of RAPs and RMs in DAA treatment-naive patients against currently used DAAs or DAAs in clinical trials. Our study suggests that RAPs and RMs profiling of HCV patients should be performed before the start of the therapy. Our results should be relevant especially in low- and middle-income countries, where the patients have a large variation of GTs and subtypes, and where the generic HCV treatment is now increasingly available.


Subjects
Antiviral Agents/therapeutic use; Drug Resistance, Viral/genetics; Genome, Viral; Hepacivirus/drug effects; Hepatitis C, Chronic/drug therapy; Polymorphism, Genetic; Viral Nonstructural Proteins/genetics; Africa/epidemiology; Americas/epidemiology; Europe/epidemiology; Gene Expression; Genotype; Hepacivirus/enzymology; Hepacivirus/genetics; Hepatitis C, Chronic/epidemiology; Hepatitis C, Chronic/virology; Humans; Isoenzymes/genetics; Mutation; Oceania/epidemiology; Pharmacogenetics/methods; Prevalence
6.
Nucleic Acids Res ; 46(W1): W271-W277, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29718426

ABSTRACT

Identifying resistance to antiretroviral drugs is crucial for ensuring the successful treatment of patients infected with viruses such as human immunodeficiency virus (HIV) or hepatitis C virus (HCV). In contrast to Sanger sequencing, next-generation sequencing (NGS) can detect resistance mutations in minority populations. Thus, genotypic resistance testing based on NGS data can offer novel, treatment-relevant insights. Since existing web services for analyzing resistance in NGS samples are subject to long processing times and follow strictly rules-based approaches, we developed geno2pheno[ngs-freq], a web service for rapidly identifying drug resistance in HIV-1 and HCV samples. By relying on frequency files that provide the read counts of nucleotides or codons along a viral genome, the time-intensive step of processing raw NGS data is eliminated. Once a frequency file has been uploaded, consensus sequences are generated for a set of user-defined prevalence cutoffs, such that the constructed sequences contain only those nucleotides whose codon prevalence exceeds a given cutoff. After locally aligning the sequences to a set of references, resistance is predicted using the well-established approaches of geno2pheno[resistance] and geno2pheno[hcv]. geno2pheno[ngs-freq] can assist clinical decision making by enabling users to explore resistance in viral populations with different abundances and is freely available at http://ngs.geno2pheno.org.
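The cutoff-based consensus step described above can be sketched as follows. This is one possible reading of the abstract, not the geno2pheno[ngs-freq] implementation: the in-memory table of codon counts, the function name `consensus`, and the choice to merge retained codons into IUPAC ambiguity codes are all assumptions made for illustration.

```python
from collections import Counter

# IUPAC ambiguity codes keyed by the set of nucleotides they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(codon_counts, cutoff):
    """Build a consensus sequence from per-position codon read counts.

    codon_counts: list of Counter, one per codon position (codon -> read count).
                  This layout is hypothetical; the real frequency-file format differs.
    cutoff: prevalence threshold in [0, 1). A codon is kept only if its share
            of reads at that position is strictly greater than the cutoff.
    """
    out = []
    for counts in codon_counts:
        total = sum(counts.values())
        kept = [c for c, n in counts.items() if total and n / total > cutoff]
        if not kept:
            out.append("NNN")  # nothing passes the cutoff: fully ambiguous codon
            continue
        # Merge the retained codons column by column into ambiguity codes.
        out.append("".join(IUPAC[frozenset(c[i] for c in kept)] for i in range(3)))
    return "".join(out)


# Toy data: a 10% minority codon at the first position.
table = [Counter({"CAG": 90, "CGG": 10}), Counter({"ATG": 100})]
for cutoff in (0.05, 0.20):
    print(cutoff, consensus(table, cutoff))
```

With the toy table, the 5% cutoff keeps the minority codon and yields an ambiguous base, while the 20% cutoff yields the majority sequence only. This is how varying the cutoff exposes variants of different abundances.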


Subjects
Drug Resistance, Viral/genetics; HIV Infections/drug therapy; HIV-1/genetics; Software; Genome, Viral/genetics; Genotype; HIV Infections/genetics; HIV Infections/virology; High-Throughput Nucleotide Sequencing; Humans; Mutation
7.
Sci Rep ; 7(1): 6371, 2017 07 25.
Article in English | MEDLINE | ID: mdl-28744024

ABSTRACT

A temporal increase in non-B subtypes has earlier been described in Sweden by us and we hypothesized that this increased viral heterogeneity may become a hotspot for the development of more complex and unique recombinant forms (URFs) if the epidemics converge. In the present study, we performed subtyping using four automated tools and phylogenetic analysis by RAxML of pol gene sequences (n = 5246) and HIV-1 near full-length genome (HIV-NFLG) sequences (n = 104). A CD4+ T-cell decline trajectory algorithm was used to estimate time of HIV infection. Transmission clusters were identified using the family-joining method. The analysis of HIV-NFLG and pol gene described 10.6% (11/104) and 2.6% (137/5246) of the strains as URFs, respectively. An increasing trend of URFs was observed in recent years by both approaches (p = 0.0082; p < 0.0001). Transmission cluster analysis using the pol gene of all URFs identified 14 clusters with two to eight sequences. Larger transmission clusters of URFs (BF1 and 01B) were observed among MSM who mostly were sero-diagnosed in recent time. Understanding the increased appearance and transmission of URFs in recent years could have importance for public health interventions and the use of HIV-NFLG would provide better statistical support for such assessments.


Subjects
HIV Infections/transmission; HIV-1/classification; Molecular Typing/methods; Sequence Analysis, RNA/methods; pol Gene Products, Human Immunodeficiency Virus/genetics; Adult; Algorithms; CD4-Positive T-Lymphocytes; Epidemics; Female; Genotyping Techniques; HIV Infections/epidemiology; HIV Infections/virology; HIV Seropositivity/epidemiology; HIV-1/genetics; Homosexuality, Male; Humans; Male; Middle Aged; Phylogeny; Recombination, Genetic; Sweden/epidemiology
9.
Mol Biol Evol ; 33(10): 2720-34, 2016 10.
Article in English | MEDLINE | ID: mdl-27436007

ABSTRACT

The widely used model for evolutionary relationships is a bifurcating tree with all taxa/observations placed at the leaves. This is not appropriate if the taxa have been densely sampled across evolutionary time and may be in a direct ancestral relationship, or if there is not enough information to fully resolve all the branching points in the evolutionary tree. In this article, we present a fast distance-based agglomeration method called family-joining (FJ) for constructing so-called generally labeled trees in which taxa may be placed at internal vertices and the tree may contain polytomies. FJ constructs such trees on the basis of pairwise distances and a distance threshold. We tested three methods for threshold selection, FJ-AIC, FJ-BIC, and FJ-CV, which minimize Akaike information criterion, Bayesian information criterion, and cross-validation error, respectively. When compared with related methods on simulated data, FJ-BIC was among the best at reconstructing the correct tree across a wide range of simulation scenarios. FJ-BIC was applied to HIV sequences sampled from individuals involved in a known transmission chain. The FJ-BIC tree was found to be compatible with almost all transmission events. On average, internal branches in the FJ-BIC tree have higher bootstrap support than branches in the leaf-labeled bifurcating tree constructed using RAxML. 36% and 25% of the internal branches in the FJ-BIC tree and RAxML tree, respectively, have bootstrap support greater than 70%. To the best of our knowledge the method presented here is the first attempt at modeling evolutionary relationships using generally labeled trees.
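For reference, the two information criteria named in this abstract have the standard textbook definitions below. The abstract does not say how FJ computes the likelihood term or counts parameters, so $\hat{L}$ (maximized likelihood), $k$ (number of free parameters), and $n$ (number of observations) are generic placeholders rather than the authors' quantities.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Both criteria are minimized. BIC penalizes each extra parameter more heavily than AIC once $\ln n > 2$, that is, for $n \ge 8$.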


Subjects
Algorithms; Models, Genetic; Phylogeny; Statistics as Topic/methods; Bayes Theorem; Biological Evolution; Computer Simulation
10.
PLoS One ; 11(5): e0155869, 2016.
Article in English | MEDLINE | ID: mdl-27196673

ABSTRACT

The face of hepatitis C virus (HCV) therapy is changing dramatically. Direct-acting antiviral agents (DAAs) specifically targeting HCV proteins have been developed and entered clinical practice in 2011. However, despite high sustained viral response (SVR) rates of more than 90%, a fraction of patients do not eliminate the virus and in these cases treatment failure has been associated with the selection of drug resistance mutations (RAMs). RAMs may be prevalent prior to the start of treatment, or can be selected under therapy, and furthermore they can persist after cessation of treatment. Additionally, certain DAAs have been approved only for distinct HCV genotypes and may even have subtype specificity. Thus, sequence analysis before start of therapy is instrumental for managing DAA-based treatment strategies. We have created the interpretation system geno2pheno[HCV] (g2p[HCV]) to analyse HCV sequence data with respect to viral subtype and to predict drug resistance. Extensive reviewing and weighting of literature related to HCV drug resistance was performed to create a comprehensive list of drug resistance rules for inhibitors of the HCV protease in non-structural protein 3 (NS3-protease: Boceprevir, Paritaprevir, Simeprevir, Asunaprevir, Grazoprevir and Telaprevir), the NS5A replicase factor (Daclatasvir, Ledipasvir, Elbasvir and Ombitasvir), and the NS5B RNA-dependent RNA polymerase (Dasabuvir and Sofosbuvir). Upon submission of up to eight sequences, g2p[HCV] aligns the input sequences, identifies the genomic region(s), predicts the HCV geno- and subtypes, and generates for each DAA a drug resistance prediction report. g2p[HCV] offers easy-to-use and fast subtype and resistance analysis of HCV sequences, is continuously updated and freely accessible under http://hcv.geno2pheno.org/index.php. 
The system was partially validated with respect to the NS3-protease inhibitors Boceprevir, Telaprevir and Simeprevir by using data generated with recombinant, phenotypic cell culture assays obtained from patients' virus variants.


Subjects
Antiviral Agents/therapeutic use; Drug Resistance, Viral; Hepacivirus/genetics; Hepatitis C, Chronic/drug therapy; Software; Algorithms; Cell Line; Genetic Association Studies; Genome, Viral; Genotype; Hepacivirus/drug effects; Humans; Inhibitory Concentration 50; Internet; Mutation; Oligopeptides/administration & dosage; Phenotype; Proline/administration & dosage; Proline/analogs & derivatives; Simeprevir/administration & dosage
11.
Bioinformatics ; 29(2): 215-22, 2013 Jan 15.
Article in English | MEDLINE | ID: mdl-23142964

ABSTRACT

MOTIVATION: Homology detection is a long-standing challenge in computational biology. To tackle this problem, typically all-versus-all BLAST results are coupled with data partitioning approaches resulting in clusters of putative homologous proteins. One of the main problems, however, has been widely neglected: all clustering tools need a density parameter that adjusts the number and size of the clusters. This parameter is crucial but hard to estimate without gold standard data at hand. Developing a gold standard, however, is a difficult and time consuming task. Having a reliable method for detecting clusters of homologous proteins between a huge set of species would open opportunities for better understanding the genetic repertoire of bacteria with different lifestyles. RESULTS: Our main contribution is a method for identifying a suitable and robust density parameter for protein homology detection without a given gold standard. Therefore, we study the core genome of 89 actinobacteria. This allows us to incorporate background knowledge, i.e. the assumption that a set of evolutionarily closely related species should share a comparably high number of evolutionarily conserved proteins (emerging from phylum-specific housekeeping genes). We apply our strategy to find genes/proteins that are specific for certain actinobacterial lifestyles, i.e. different types of pathogenicity. The whole study was performed with transitivity clustering, as it only requires a single intuitive density parameter and has been shown to be well applicable for the task of protein sequence clustering. Note, however, that the presented strategy generally does not depend on our clustering method but can easily be adapted to other clustering approaches. AVAILABILITY: All results are publicly available at http://transclust.mmci.uni-saarland.de/actino_core/ or as Supplementary Material of this article. 
CONTACT: roettger@mpi-inf.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
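To illustrate why the density parameter matters, the sketch below shows how a single similarity threshold changes the number and size of clusters. It deliberately uses plain connected components (single linkage) as a stand-in, which is not the transitivity clustering used in the study. The scores are made-up numbers, and the function name `clusters` is hypothetical.

```python
def clusters(similarity, threshold):
    """Group items into connected components of the graph whose edges are the
    pairs with similarity >= threshold.

    This is single-linkage grouping, used here only as a simple stand-in;
    it is not transitivity clustering.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        ra, rb = find(a), find(b)  # registers both items even when no edge is added
        if score >= threshold and ra != rb:
            parent[ra] = rb

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(g) for g in groups.values())


# Toy pairwise scores standing in for BLAST-derived similarities (invented values).
scores = {("p1", "p2"): 90, ("p2", "p3"): 40, ("p3", "p4"): 85, ("p1", "p4"): 10}
for threshold in (30, 50, 95):
    print(threshold, clusters(scores, threshold))
```

On the toy data, the three thresholds give one cluster of four proteins, two clusters of two, and four singletons respectively. Without a gold standard there is no obvious way to pick among such outcomes, which is the problem the abstract addresses.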


Subjects
Actinobacteria/classification; Bacterial Proteins/chemistry; Sequence Homology, Amino Acid; Actinobacteria/genetics; Actinobacteria/pathogenicity; Algorithms; Bacterial Proteins/genetics; Cluster Analysis; Genome, Bacterial; Models, Genetic; Phylogeny; Sequence Alignment
12.
Integr Biol (Camb) ; 4(7): 728-33, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22318347

ABSTRACT

Pathogenic Escherichia coli, such as Enterohemorrhagic E. coli (EHEC) and Enteroaggregative E. coli (EAEC), are globally widespread bacteria. Some may cause the hemolytic uremic syndrome (HUS). Varying strains cause epidemics all over the world. Recently, we observed an epidemic outbreak of a multi-resistant EHEC strain in Western Europe, mainly in Germany. The Robert Koch Institute reports >4300 infections and >50 deaths (July, 2011). Farmers lost several million EUR since the origin of infection was unclear. Here, we contribute to the currently ongoing research with a computer-aided study of EHEC transcriptional regulatory interactions, a network of genetic switches that control, for instance, pathogenicity, survival and reproduction of bacterial cells. Our strategy is to utilize knowledge of gene regulatory networks from the evolutionary relative E. coli K-12, a harmless strain mainly used for wet lab studies. In order to provide high-potential candidates for human pathogenic E. coli bacteria, such as EHEC, we developed the integrated online database and an analysis platform EhecRegNet. We utilize 3489 known regulations from E. coli K-12 for predictions of yet unknown gene regulatory interactions in 16 human pathogens. For these strains we predict 40,913 regulatory interactions. EhecRegNet is based on the identification of evolutionarily conserved regulatory sites within the DNA of the harmless E. coli K-12 and the pathogens. Identifying and characterizing EHEC's genetic control mechanism network on a large scale will allow for a better understanding of its survival and infection strategies. This will support the development of urgently needed new treatments. EhecRegNet is online via http://www.ehecregnet.de.


Subjects
Enterohemorrhagic Escherichia coli/genetics; Enterohemorrhagic Escherichia coli/pathogenicity; Escherichia coli Infections/microbiology; Escherichia coli/genetics; Escherichia coli/pathogenicity; Gene Regulatory Networks; Computational Biology/methods; Databases, Genetic; Drug Resistance, Bacterial; Gene Expression Regulation, Bacterial; Hemolytic-Uremic Syndrome/microbiology; Humans; Internet; Models, Genetic; Software; Transcription, Genetic
13.
Int J Cancer ; 127(10): 2374-85, 2010 Nov 15.
Article in English | MEDLINE | ID: mdl-20473924

ABSTRACT

For neuroblastoma, the most common extracranial tumour of childhood, identification of new biomarkers and potential therapeutic targets is mandatory to improve risk stratification and survival rates. MicroRNAs are deregulated in most cancers, including neuroblastoma. In this study, we analysed 430 miRNAs in 69 neuroblastomas by stem-loop RT-qPCR. Prediction of event-free survival (EFS) with support vector machines (SVM) and actual survival times with Cox regression-based models (CASPAR) were highly accurate and were independently validated. SVM-accuracy for prediction of EFS was 88.7% (95% CI: 88.5-88.8%). For CASPAR-based predictions, 5y-EFS probability was 0.19 (95% CI: 0-0.38) in the CASPAR-predicted short survival group compared with 0.78 (95% CI: 0.64-0.93) in the CASPAR-predicted long survival group. Both classifiers were validated on an independent test set yielding accuracies of 94.74% (SVM) and 5y-EFS probabilities as 0.25 (95% CI: 0.0-0.55) for short versus 1 ± 0.0 for long survival (CASPAR), respectively. Amplification of the MYCN oncogene was highly correlated with deregulation of miRNA expression. In addition, 37 miRNAs correlated with TrkA expression, a marker of excellent outcome, and 6 miRNAs further analysed in vitro were regulated upon TrkA transfection, suggesting a functional relationship. Expression of the most significant TrkA-correlated miRNA, miR-542-5p, also discriminated between local and metastatic disease and was inversely correlated with MYCN amplification and event-free survival. We conclude that neuroblastoma patient outcome prediction using miRNA expression is feasible and effective. Studies testing miRNA-based predictors in comparison to and in combination with mRNA and aCGH information should be initiated. Specific miRNAs (e.g., miR-542-5p) might be important in neuroblastoma tumour biology, and qualify as potential therapeutic targets.


Subjects
MicroRNAs/biosynthesis; Neuroblastoma/genetics; Algorithms; Female; Gene Expression Profiling; Humans; Infant; Male; MicroRNAs/genetics; Neuroblastoma/metabolism; Receptor, trkA/biosynthesis; Receptor, trkA/genetics; Reproducibility of Results; Reverse Transcriptase Polymerase Chain Reaction