Results 1 - 18 of 18
1.
Commun Biol ; 4(1): 1060, 2021 09 10.
Article in English | MEDLINE | ID: mdl-34508155

ABSTRACT

Prediction of T-cell receptor (TCR) interactions with MHC-peptide complexes remains highly challenging. This challenge is primarily due to three dominant factors: data accuracy, data scarceness, and problem complexity. Here, we showcase that "shallow" convolutional neural network (CNN) architectures are adequate to deal with the problem complexity imposed by the length variations of TCRs. We demonstrate that current public bulk CDR3β-pMHC binding data overall is of low quality and that the development of accurate prediction models is contingent on paired α/β TCR sequence data corresponding to at least 150 distinct pairs for each investigated pMHC. In comparison, models trained on CDR3α or CDR3β data alone demonstrated a variable and pMHC specific relative performance drop. Together these findings support that T-cell specificity is predictable given the availability of accurate and sufficient paired TCR sequence data. NetTCR-2.0 is publicly available at https://services.healthtech.dtu.dk/service.php?NetTCR-2.0 .


Subjects
Neural Networks, Computer; Receptors, Antigen, T-Cell/chemistry; Protein Binding
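The abstract above says a shallow CNN can cope with TCR length variation but does not describe how. One common way to get a fixed-size output from variable-length input is a convolution followed by global max pooling. The sketch below illustrates only that idea; the pooling step, the one-feature aromatic encoding, the kernel, and all function names are assumptions for illustration and are not taken from NetTCR-2.0.

```python
# Illustrative only: a toy encoding and kernel, not the NetTCR-2.0 model.
AROMATIC = set("FWY")  # hypothetical one-feature residue encoding


def encode(sequence):
    """Map each residue to 1.0 if it is aromatic, else 0.0."""
    return [1.0 if residue in AROMATIC else 0.0 for residue in sequence]


def conv_global_max(features, kernel):
    """Slide the kernel over the features and return the largest activation."""
    width = len(kernel)
    if len(features) < width:
        raise ValueError("sequence is shorter than the kernel")
    activations = [
        sum(w * x for w, x in zip(kernel, features[start:start + width]))
        for start in range(len(features) - width + 1)
    ]
    return max(activations)


kernel = [1.0, 1.0, 1.0]
# Two CDR3-like strings of different length each yield a single number.
long_cdr3 = conv_global_max(encode("CASSLGQAYEQYF"), kernel)
short_cdr3 = conv_global_max(encode("CASSF"), kernel)
print(long_cdr3, short_cdr3)
```

Because the maximum is taken over all window positions, the output size depends on the number of kernels rather than on the sequence length, so no fixed-length truncation is needed.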
2.
Sci Rep ; 10(1): 21523, 2020 12 09.
Article in English | MEDLINE | ID: mdl-33299076

ABSTRACT

Complications of atherosclerosis are the leading cause of morbidity and mortality worldwide. Various genetically modified mouse models are used to investigate disease trajectory with classical histology, currently the preferred methodology to elucidate plaque composition. Here, we show the strength of light-sheet fluorescence microscopy combined with deep learning image analysis for characterising and quantifying plaque burden and composition in whole aorta specimens. 3D imaging is a non-destructive method that requires minimal ex vivo handling and can be up-scaled to large sample sizes. Combined with deep learning, atherosclerotic plaque in mice can be identified without any ex vivo staining due to the autofluorescent nature of the tissue. The aorta and its branches can subsequently be segmented to determine how anatomical position affects plaque composition and progression. Here, we find the highest plaque accumulation in the aortic arch and brachiocephalic artery. Simultaneously, aortas can be stained for markers of interest (for example the pan immune cell marker CD45) and quantified. In ApoE-/- mice we observe that levels of CD45 reach a plateau after which increases in plaque volume no longer correlate to immune cell infiltration. All underlying code is made publicly available to ease adaption of the method.


Subjects
Plaque, Atherosclerotic/diagnostic imaging; Plaque, Atherosclerotic/metabolism; Plaque, Atherosclerotic/pathology; Animals; Aorta/pathology; Aortic Diseases; Apolipoproteins E/analysis; Atherosclerosis/complications; Atherosclerosis/pathology; Deep Learning; Disease Models, Animal; Female; Image Processing, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Male; Mice; Mice, Inbred C57BL; Mice, Knockout; Microscopy, Fluorescence/methods; Receptors, LDL/analysis
3.
Dis Model Mech ; 12(11)2019 11 22.
Article in English | MEDLINE | ID: mdl-31704726

ABSTRACT

Parkinson's disease (PD) is a basal ganglia movement disorder characterized by progressive degeneration of the nigrostriatal dopaminergic system. Immunohistochemical methods have been widely used for characterization of dopaminergic neuronal injury in animal models of PD, including the MPTP (1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine) mouse model. However, conventional immunohistochemical techniques applied to tissue sections have inherent limitations with respect to loss of 3D resolution, yielding insufficient information on the architecture of the dopaminergic system. To provide a more comprehensive and non-biased map of MPTP-induced changes in central dopaminergic pathways, we used iDISCO immunolabeling, light-sheet fluorescence microscopy (LSFM) and deep-learning computational methods for whole-brain three-dimensional visualization and automated quantitation of tyrosine hydroxylase (TH)-positive neurons in the adult mouse brain. Mice terminated 7 days after acute MPTP administration demonstrated widespread alterations in TH expression. Compared to vehicle controls, MPTP-dosed mice showed a significant loss of TH-positive neurons in the substantia nigra pars compacta and ventral tegmental area. Also, MPTP dosing reduced overall TH signal intensity in basal ganglia nuclei, i.e. the substantia nigra, caudate-putamen, globus pallidus and subthalamic nucleus. In contrast, increased TH signal intensity was predominantly observed in limbic regions, including several subdivisions of the amygdala and hypothalamus. In conclusion, mouse whole-brain 3D imaging is ideal for unbiased automated counting and densitometric analysis of TH-positive cells. The LSFM-deep learning pipeline tracked brain-wide changes in catecholaminergic pathways in the MPTP mouse model of PD, and may be applied for preclinical characterization of compounds targeting dopaminergic neurotransmission.


Subjects
Brain/diagnostic imaging; Disease Models, Animal; Imaging, Three-Dimensional/methods; Neurons/enzymology; Parkinson Disease/diagnostic imaging; Tyrosine 3-Monooxygenase/analysis; Animals; Deep Learning; MPTP Poisoning/diagnostic imaging; Mice; Microscopy, Fluorescence; Motor Skills; Parkinson Disease/enzymology
4.
Sci Rep ; 9(1): 14530, 2019 10 10.
Article in English | MEDLINE | ID: mdl-31601838

ABSTRACT

The interaction between the class I major histocompatibility complex (MHC), the peptide presented by the MHC and the T-cell receptor (TCR) is a key determinant of the cellular immune response. Here, we present TCRpMHCmodels, a method for accurate structural modelling of the TCR-peptide-MHC (TCR-pMHC) complex. This TCR-pMHC modelling pipeline takes as input the amino acid sequence and generates models of the TCR-pMHC complex, with a median Cα RMSD of 2.31 Å. TCRpMHCmodels significantly outperforms TCRFlexDock, a specialised method for docking pMHC and TCR structures. TCRpMHCmodels is simple to use and the modelling pipeline takes, on average, only two minutes. Thanks to its ease of use and high modelling accuracy, we expect TCRpMHCmodels to provide insights into the underlying mechanisms of TCR and pMHC interactions and aid in the development of advanced T-cell-based immunotherapies and rational design of vaccines. The TCRpMHCmodels tool is available at http://www.cbs.dtu.dk/services/TCRpMHCmodels/ .


Subjects
Histocompatibility Antigens Class I/chemistry; Models, Molecular; Receptors, Antigen, T-Cell/chemistry; Antigens/chemistry; Computational Biology; Databases, Protein; Epitopes/chemistry; Humans; Immune System; Peptides/chemistry; T-Lymphocytes/immunology
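The accuracy figure in the abstract above is a Cα RMSD, which for N matched Cα atoms is sqrt((1/N) * sum of squared distances between corresponding atoms). The sketch below shows that arithmetic only. It assumes the model and reference coordinates are already superimposed and listed in the same residue order; the abstract does not state the superposition protocol, and the function name and toy coordinates are hypothetical.

```python
import math


def ca_rmsd(model, reference):
    """Root-mean-square deviation between matched C-alpha coordinates.

    Both arguments are equal-length sequences of (x, y, z) tuples in
    angstroms, assumed to be superimposed already.
    """
    if not model or len(model) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (a - b) ** 2
        for atom_m, atom_r in zip(model, reference)
        for a, b in zip(atom_m, atom_r)
    )
    return math.sqrt(squared_sum / len(model))


# Toy coordinates (not from any real structure): one atom is displaced by 2 A.
toy_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(ca_rmsd(toy_model, toy_reference))
```

Lower values mean the modelled backbone sits closer to the reference structure.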
5.
Nucleic Acids Res ; 47(W1): W502-W506, 2019 07 02.
Article in English | MEDLINE | ID: mdl-31114900

ABSTRACT

The Immune Epitope Database Analysis Resource (IEDB-AR, http://tools.iedb.org/) is a companion website to the IEDB that provides computational tools focused on the prediction and analysis of B and T cell epitopes. All of the tools are freely available through the public website and many are also available through a REST API and/or a downloadable command-line tool. A virtual machine image of the entire site is also freely available for non-commercial use and contains most of the tools on the public site. Here, we describe the tools and functionalities that are available in the IEDB-AR, focusing on the 10 new tools that have been added since the last report in the 2012 NAR webserver edition. In addition, many of the tools that were already hosted on the site in 2012 have received updates to newest versions, including NetMHC, NetMHCpan, BepiPred and DiscoTope. Overall, this IEDB-AR update provides a substantial set of updated and novel features for epitope prediction and analysis.


Subjects
Epitopes, B-Lymphocyte/chemistry; Epitopes, T-Lymphocyte/chemistry; Software; Animals; Databases, Protein; Epitopes, B-Lymphocyte/immunology; Epitopes, T-Lymphocyte/immunology; Histocompatibility Antigens/metabolism; Humans; Mice
6.
Proteins ; 87(6): 520-527, 2019 06.
Article in English | MEDLINE | ID: mdl-30785653

ABSTRACT

The ability to predict local structural features of a protein from the primary sequence is of paramount importance for unraveling its function in absence of experimental structural information. Two main factors affect the utility of potential prediction tools: their accuracy must enable extraction of reliable structural information on the proteins of interest, and their runtime must be low to keep pace with sequencing data being generated at a constantly increasing speed. Here, we present NetSurfP-2.0, a novel tool that can predict the most important local structural features with unprecedented accuracy and runtime. NetSurfP-2.0 is sequence-based and uses an architecture composed of convolutional and long short-term memory neural networks trained on solved protein structures. Using a single integrated model, NetSurfP-2.0 predicts solvent accessibility, secondary structure, structural disorder, and backbone dihedral angles for each residue of the input sequences. We assessed the accuracy of NetSurfP-2.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features. We observe a correlation of 80% between predictions and experimental data for solvent accessibility, and a precision of 85% on secondary structure 3-class predictions. In addition to improved accuracy, the processing time has been optimized to allow predicting more than 1000 proteins in less than 2 hours, and complete proteomes in less than 1 day.


Subjects
Databases, Protein; Deep Learning; Computational Biology; Protein Structure, Secondary; Proteome/chemistry
7.
Methods Mol Biol ; 1878: 157-172, 2019.
Article in English | MEDLINE | ID: mdl-30378075

ABSTRACT

Cancer immunotherapy has experienced several major breakthroughs in the past decade. Most recently, technical advances in next-generation sequencing methods have enabled discovery of tumor-specific mutations leading to protective T cell neoepitopes. Many of the successes are enabled by computational methods, which facilitate processing of raw data, mapping of mutations, and prediction of neoepitopes. In this book chapter, we provide an overview of the computational tasks related to the identification of neoepitopes, propose specific tools and best practices, and discuss strengths, weaknesses, and future challenges.


Subjects
Epitopes/genetics; Epitopes/immunology; Neoplasms/genetics; Neoplasms/immunology; T-Lymphocytes/immunology; Computational Biology/methods; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Humans; Immunotherapy/methods; Mutation/genetics
8.
Bioinformatics ; 35(7): 1098-1107, 2019 04 01.
Article in English | MEDLINE | ID: mdl-30169744

ABSTRACT

MOTIVATION: Understanding the specificity of protein receptor-ligand interactions is pivotal for our comprehension of biological mechanisms and systems. Receptor protein families often have a certain level of sequence diversity that converges into fewer conserved protein structures, allowing the exertion of well-defined functions. T and B cell receptors of the immune system and protein kinases that control the dynamic behaviour and decision processes in eukaryotic cells by catalysing phosphorylation represent prime examples. Driven by the large sequence diversity, the receptors within such protein families are often found to share specificities although divergent at the sequence level. This observation has led to the notion that prediction models of such systems are most effectively handled in a receptor-specific manner. RESULTS: We show that this approach in many cases is suboptimal, and describe an alternative improved framework for generating models with pan-receptor-predictive power for receptor protein families. The framework is based on deep artificial neural networks and integrates information from individual receptors into a single pan-receptor model, leveraging information across multiple receptor-specific datasets allowing predictions of the receptor specificity for all members of a given protein family including those described by limited or no ligand data. The approach was applied to the protein kinase superfamily, leading to the method NetPhosPan. The method was extensively validated and benchmarked against state-of-the-art prediction methods and was found to have unprecedented performance, in particular for kinase domains characterized by limited or no experimental data. AVAILABILITY AND IMPLEMENTATION: The method is freely available to non-commercial users and can be downloaded at http://www.cbs.dtu.dk/services/NetPhospan-1.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Neural Networks, Computer; Ligands; Phosphorylation; Protein Kinases; Proteins
9.
Front Immunol ; 9: 1795, 2018.
Article in English | MEDLINE | ID: mdl-30127785

ABSTRACT

CD4+ T cells have a major role in regulating immune responses. They are activated by recognition of peptides mostly generated from exogenous antigens through the major histocompatibility complex (MHC) class II pathway. Identification of epitopes is important and computational prediction of epitopes is used widely to save time and resources. Although there are algorithms to predict binding affinity of peptides to MHC II molecules, no accurate methods exist to predict which ligands are generated as a result of natural antigen processing. We utilized a dataset of around 14,000 naturally processed ligands identified by mass spectrometry of peptides eluted from MHC class II expressing cells to investigate the existence of sequence signatures potentially related to the cleavage mechanisms that liberate the presented peptides from their source antigens. This analysis revealed preferred amino acids surrounding both N- and C-termini of ligands, indicating sequence-specific cleavage preferences. We used these cleavage motifs to develop a method for predicting naturally processed MHC II ligands, and validated that it had predictive power to identify ligands from independent studies. We further confirmed that prediction of ligands based on cleavage motifs could be combined with predictions of MHC binding, and that the combined prediction had superior performance. However, when attempting to predict CD4+ T cell epitopes, either alone or in combination with MHC binding predictions, predictions based on the cleavage motifs did not show predictive power. Given that peptides identified as epitopes based on CD4+ T cell reactivity typically do not have well-defined termini, it is possible that motifs are present but outside of the mapped epitope. Our attempts to take that into account computationally did not show any sign of an increased presence of cleavage motifs around well-characterized CD4+ T cell epitopes. While it is possible that our attempts to translate the cleavage motifs in MHC II ligand elution data into T cell epitope predictions were suboptimal, other possible explanations are that the cleavage signal is too diluted to be detected, or that elution data are enriched for ligands generated through an antigen processing and presentation pathway that is less frequently utilized for T cell epitopes.


Subjects
Algorithms; Antigen Presentation; CD4-Positive T-Lymphocytes/metabolism; Epitopes, T-Lymphocyte/metabolism; Histocompatibility Antigens Class II/metabolism; Amino Acid Motifs; Amino Acids/metabolism; Binding Sites; CD4-Positive T-Lymphocytes/immunology; Databases, Protein; Datasets as Topic; Histocompatibility Antigens Class II/immunology; Humans; Ligands; Mass Spectrometry; Peptides/metabolism; Protein Binding; Proteolysis
10.
Front Immunol ; 9: 1007, 2018.
Article in English | MEDLINE | ID: mdl-29795801

ABSTRACT

[This corrects the article on p. 1566 in vol. 8, PMID: 29187854.]

11.
Cancer Immunol Res ; 6(6): 636-644, 2018 06.
Article in English | MEDLINE | ID: mdl-29615400

ABSTRACT

With the advancement of personalized cancer immunotherapies, new tools are needed to identify tumor antigens and evaluate T-cell responses in model systems, specifically those that exhibit clinically relevant tumor progression. Key transgenic mouse models of breast cancer are generated and maintained on the FVB genetic background, and one such model is the mouse mammary tumor virus-polyomavirus middle T antigen (MMTV-PyMT) mouse, an immunocompetent transgenic mouse that exhibits spontaneous mammary tumor development and metastasis with high penetrance. Backcrossing the MMTV-PyMT mouse from the FVB strain onto a C57BL/6 genetic background, in order to leverage well-developed C57BL/6 immunologic tools, results in delayed tumor development and variable metastatic phenotypes. Therefore, we initiated characterization of the FVB MHC class I H-2q haplotype to establish useful immunologic tools for evaluating antigen specificity in the murine FVB strain. Our study provides the first detailed molecular and immunoproteomic characterization of the FVB H-2q MHC class I alleles, including >8,500 unique peptide ligands, a multiallele murine MHC peptide prediction tool, and in vivo validation of these data using MMTV-PyMT primary tumors. This work allows researchers to rapidly predict H-2 peptide ligands for immune testing, including, but not limited to, the MMTV-PyMT model for metastatic breast cancer. Cancer Immunol Res; 6(6); 636-44. ©2018 AACR.


Subjects
Computational Biology/methods; Epitope Mapping/methods; Epitopes/immunology; Histocompatibility Antigens/immunology; Neoplasms/immunology; Peptides/immunology; Software; Amino Acid Sequence; Animals; Binding Sites; Cell Line, Tumor; Chromatography, Liquid; Disease Models, Animal; Female; H-2 Antigens/chemistry; H-2 Antigens/genetics; H-2 Antigens/immunology; Haplotypes; Humans; Ligands; Mammary Neoplasms, Animal; Mammary Neoplasms, Experimental; Mass Spectrometry; Mice; Protein Binding
12.
Front Immunol ; 8: 1566, 2017.
Article in English | MEDLINE | ID: mdl-29187854

ABSTRACT

Personalization of cancer immunotherapies such as therapeutic vaccines and adoptive T-cell therapy may benefit from efficient identification and targeting of patient-specific neoepitopes. However, current neoepitope prediction methods based on sequencing and predictions of epitope processing and presentation result in a low rate of validation, suggesting that the determinants of peptide immunogenicity are not well understood. We gathered published data on human neopeptides originating from single amino acid substitutions for which T cell reactivity had been experimentally tested, including both immunogenic and non-immunogenic neopeptides. Out of 1,948 neopeptide-HLA (human leukocyte antigen) combinations from 13 publications, 53 were reported to elicit a T cell response. From these data, we found an enrichment for responses among peptides of length 9. Even though the peptides had been pre-selected based on presumed likelihood of being immunogenic, we found using NetMHCpan-4.0 that immunogenic neopeptides were predicted to bind significantly more strongly to HLA compared to non-immunogenic peptides. Investigation of the HLA binding strength of the immunogenic peptides revealed that the vast majority (96%) shared very strong predicted binding to HLA and that the binding strength was comparable to that observed for pathogen-derived epitopes. Finally, we found that neopeptide dissimilarity to self is a predictor of immunogenicity in situations where neo- and normal peptides share comparable predicted binding strength. In conclusion, these results suggest new strategies for prioritization of mutated peptides, but new data will be needed to confirm their value.

13.
J Immunol ; 199(9): 3360-3368, 2017 11 01.
Article in English | MEDLINE | ID: mdl-28978689

ABSTRACT

Cytotoxic T cells are of central importance in the immune system's response to disease. They recognize defective cells by binding to peptides presented on the cell surface by MHC class I molecules. Peptide binding to MHC molecules is the single most selective step in the Ag-presentation pathway. Therefore, in the quest for T cell epitopes, the prediction of peptide binding to MHC molecules has attracted widespread attention. In the past, predictors of peptide-MHC interactions have primarily been trained on binding affinity data. Recently, an increasing number of MHC-presented peptides identified by mass spectrometry have been reported containing information about peptide-processing steps in the presentation pathway and the length distribution of naturally presented peptides. In this article, we present NetMHCpan-4.0, a method trained on binding affinity and eluted ligand data leveraging the information from both data types. Large-scale benchmarking of the method demonstrates an increase in predictive performance compared with state-of-the-art methods when it comes to identification of naturally processed ligands, cancer neoantigens, and T cell epitopes.


Subjects
Databases, Protein; Epitopes, T-Lymphocyte/immunology; Histocompatibility Antigens Class I/immunology; Peptides/immunology; Software; Humans; Predictive Value of Tests
14.
Bioinformatics ; 33(22): 3685-3690, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-28961695

ABSTRACT

MOTIVATION: Deep neural network architectures such as convolutional and long short-term memory networks have become increasingly popular as machine learning tools in recent years. The availability of greater computational resources, more data, new algorithms for training deep models and easy-to-use libraries for implementation and training of neural networks are the drivers of this development. The use of deep learning has been especially successful in image recognition; and the development of tools, applications and code examples is in most cases centered within this field rather than within biology. RESULTS: Here, we aim to further the development of deep learning methods within biology by providing application examples and code templates that are ready to apply and adapt. Given such examples, we illustrate how architectures consisting of convolutional and long short-term memory neural networks can relatively easily be designed and trained to state-of-the-art performance on three biological sequence problems: prediction of subcellular localization, protein secondary structure and the binding of peptides to MHC Class II molecules. AVAILABILITY AND IMPLEMENTATION: All implementations and datasets are available online to the scientific community at https://github.com/vanessajurtz/lasagne4bio. CONTACT: skaaesonderby@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning; Protein Structure, Secondary; Protein Transport; Sequence Analysis, Protein/methods; Computational Biology/methods; Neural Networks, Computer; Peptides/metabolism; Protein Binding
15.
Immunology ; 152(2): 255-264, 2017 10.
Article in English | MEDLINE | ID: mdl-28542831

ABSTRACT

MHC class II molecules play a fundamental role in the cellular immune system: they load short peptide fragments derived from extracellular proteins and present them on the cell surface. It is currently thought that the peptide binds lying more or less flat in the MHC groove, with a fixed distance of nine amino acids between the first and last residue in contact with the MHCII. While confirming that the great majority of peptides bind to the MHC using this canonical mode, we report evidence for an alternative, less common mode of interaction. A fraction of observed ligands were shown to have an unconventional spacing of the anchor residues that directly interact with the MHC, which could only be accommodated to the canonical MHC motif either by imposing a more stretched out peptide backbone (an 8mer core) or by the peptide bulging out of the MHC groove (a 10mer core). We estimated that on average 2% of peptides bind with a core deletion, and 0.45% with a core insertion, but the frequency of such non-canonical cores was as high as 10% for certain MHCII molecules. A mutational analysis and experimental validation of a number of these anomalous ligands demonstrated that they could only fit to their MHC binding motif with a non-canonical binding core of length different from nine. This previously undescribed mode of peptide binding to MHCII molecules gives a more complete picture of peptide presentation by MHCII and allows us to model more accurately this event.


Subjects
Histocompatibility Antigens Class II/metabolism; Machine Learning; Neural Networks, Computer; Peptides/metabolism; Antibodies, Monoclonal/immunology; Antibodies, Monoclonal/metabolism; Binding Sites; Computational Biology; Databases, Protein; Epitopes; Histocompatibility Antigens Class II/chemistry; Histocompatibility Antigens Class II/immunology; Humans; Ligands; Mutation; Peptides/chemistry; Peptides/genetics; Peptides/immunology; Protein Binding; Protein Interaction Domains and Motifs; Structure-Activity Relationship
16.
PLoS One ; 11(9): e0163111, 2016.
Article in English | MEDLINE | ID: mdl-27684958

ABSTRACT

Bacteriophages are the most abundant biological entity on the planet, but at the same time do not account for much of the genetic material isolated from most environments due to their small genome sizes. They also show great genetic diversity and mosaic genomes making it challenging to analyze and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to out-perform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder.
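The abstract above says that hits to multiple reference genomes are integrated, but it does not give the scoring formula. As a hedged illustration of why integration helps with mosaic genomes, the sketch below merges hit intervals from several hypothetical reference genomes and reports the fraction of the contig they cover together. The coverage measure, the interval convention (0-based, end-exclusive), and the function name are assumptions, not MetaPhinder's published procedure.

```python
def merged_coverage(hits, contig_length):
    """Fraction of a contig covered by the union of hit intervals.

    hits is an iterable of (start, end) positions on the contig,
    0-based with an exclusive end, possibly overlapping and possibly
    coming from different reference genomes.
    """
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(hits):
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / contig_length


# Toy hits on a 1000 bp contig, as if from three different phage genomes.
toy_hits = [(0, 400), (300, 700), (900, 1000)]
print(merged_coverage(toy_hits, 1000))       # union of all hits
print(merged_coverage(toy_hits[:1], 1000))   # a single hit on its own
```

In this toy case no single hit spans more than 40% of the contig, while the merged hits span 80%, which is the kind of signal a single-hit comparison would miss.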

17.
PLoS One ; 11(6): e0157718, 2016.
Article in English | MEDLINE | ID: mdl-27327771

ABSTRACT

Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available.


Subjects
Bacteria/genetics; Diagnostic Techniques and Procedures; Genome, Bacterial; Sequence Analysis, DNA/methods; Statistics as Topic; Algorithms; Bacteria/pathogenicity; Base Sequence; Plasmids/metabolism; Software; Species Specificity; Time Factors; Virulence/genetics
18.
Viruses ; 8(5)2016 05 04.
Article in English | MEDLINE | ID: mdl-27153081

ABSTRACT

The current dramatic increase of antibiotic resistant bacteria has revitalised the interest in bacteriophages as alternative antibacterial treatment. Meanwhile, the development of bioinformatics methods for analysing genomic data places high-throughput approaches for phage characterization within reach. Here, we present HostPhinder, a tool aimed at predicting the bacterial host of phages by examining the phage genome sequence. Using a reference database of 2196 phages with known hosts, HostPhinder predicts the host species of a query phage as the host of the most genomically similar reference phages. As a measure of genomic similarity the number of co-occurring k-mers (DNA sequences of length k) is used. Using an independent evaluation set, HostPhinder was able to correctly predict host genus and species for 81% and 74% of the phages respectively, giving predictions for more phages than BLAST and significantly outperforming BLAST on phages for which both had predictions. HostPhinder predictions on phage draft genomes from the INTESTI phage cocktail corresponded well with the advertised targets of the cocktail. Our study indicates that for most phages genomic similarity correlates well with related bacterial hosts. HostPhinder is available as an interactive web service [1] and as a stand alone download from the Docker registry [2].


Subjects
Bacteria/virology; Bacteriophages/genetics; Bacteriophages/physiology; Computational Biology/methods; Genome, Viral; Host Specificity
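The HostPhinder abstract above measures genomic similarity as the number of co-occurring k-mers and assigns the host of the most similar reference phages. A minimal sketch of that idea follows. It simplifies to the single best-scoring reference, uses made-up toy sequences and host labels, and the function names and the choice k = 4 are hypothetical; none of this reproduces the tool's actual scoring or database.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of a DNA sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def predict_host(query, references, k):
    """Pick the host of the reference sharing the most k-mers with the query.

    references is a list of (genome_sequence, host_name) pairs.
    Returns (host_name, shared_kmer_count), or (None, 0) if nothing is shared.
    """
    query_kmers = kmers(query, k)
    best_host, best_shared = None, 0
    for genome, host in references:
        shared = len(query_kmers & kmers(genome, k))
        if shared > best_shared:
            best_host, best_shared = host, shared
    return best_host, best_shared


# Toy data for illustration only; real phage genomes are tens of kilobases.
toy_references = [
    ("ATGCGTACGTTAGC", "Escherichia coli"),
    ("TTTTCCCCGGGGAAAA", "Staphylococcus aureus"),
]
toy_prediction = predict_host("GCGTACGTTA", toy_references, k=4)
print(toy_prediction)
```

Returning the shared count alongside the host makes it possible to decline a prediction when the best match is weak, which is one way a tool could report no prediction for some phages.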