Search | BVS Regional Portal (test)

Aligning the unalignable: bacteriophage whole genome alignments.

Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M.

BMC Bioinformatics ; 17: 30, 2016 Jan 13.

Article in English | MEDLINE | ID: mdl-26757899

ABSTRACT

BACKGROUND: In recent years, many studies have focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. RESULTS: In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. CONCLUSIONS: Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A Python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on Bitbucket (https://bitbucket.org/thekswenson/alpha).

Subjects

Bacteriophages/genetics ; Genome, Viral ; Mycobacterium/virology ; Sequence Alignment/methods ; Staphylococcus aureus/virology ; Algorithms ; Computational Biology/methods ; Genomics/methods
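
The partial-order idea in the abstract above can be illustrated with a toy sketch. It is not the Alpha implementation: it assumes each genome has already been reduced to an ordered list of hypothetical module labels (one label per functional module, each appearing at most once per genome), builds a partial order from the adjacencies observed in every genome, and then linearizes that order only to print aligned rows. The genome names and labels are invented for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def build_partial_order(genomes):
    """Map each module to the set of modules seen directly before it in any genome."""
    predecessors = {}
    for modules in genomes.values():
        for module in modules:
            predecessors.setdefault(module, set())
        for before, after in zip(modules, modules[1:]):
            predecessors[after].add(before)
    return predecessors


def align(genomes):
    """Return (columns, rows); a row holds '-' where the genome lacks a module."""
    graph = build_partial_order(genomes)
    # static_order() raises CycleError when the genomes are not collinear.
    columns = list(TopologicalSorter(graph).static_order())
    rows = {}
    for name, modules in genomes.items():
        present = set(modules)
        rows[name] = [m if m in present else "-" for m in columns]
    return columns, rows


if __name__ == "__main__":
    # Hypothetical module labels; B and C are alternative modules in the same slot.
    toy = {
        "phage1": ["A", "B", "D"],
        "phage2": ["A", "C", "D"],
    }
    columns, rows = align(toy)
    for name, row in rows.items():
        print(name, " ".join(row))
```

In this toy input, modules B and C are unordered with respect to each other, which is how a partial order records two dissimilar sequences occupying the same position. Genomes whose module orders conflict raise `CycleError` instead of being forced into an alignment.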

Assisted transcriptome reconstruction and splicing orthology.

Blanquart, Samuel; Varré, Jean-Stéphane; Guertin, Paul; Perrin, Amandine; Bergeron, Anne; Swenson, Krister M.

BMC Genomics ; 17(Suppl 10): 786, 2016 Nov 11.

Article in English | MEDLINE | ID: mdl-28185551

ABSTRACT

BACKGROUND: Transcriptome reconstruction, defined as the identification of all protein isoforms that may be expressed by a gene, is a notably difficult computational task. With real data, the best methods based on RNA-seq data identify barely 21% of the expressed transcripts. While waiting for algorithms and sequencing techniques to improve - as has been strongly suggested in the literature - it is important to evaluate assisted transcriptome prediction; this is the question of how alternative transcription in one species performs as a predictor of protein isoforms in another relatively close species. Most evidence-based gene predictors use transcripts from other species to annotate a genome, but the predictive power of procedures that use exclusively transcripts from external species has never been quantified. The cornerstone of such an evaluation is the correct identification of pairs of transcripts with the same splicing patterns, called splicing orthologs. RESULTS: We propose a rigorous procedural definition of splicing orthologs, based on the identification of all ortholog pairs of splicing sites in the nucleotide sequences, and alignments at the protein level. Using our definition, we compared 24,382 human transcripts and 17,909 mouse transcripts from the highly curated CCDS database, and identified 11,122 splicing orthologs. In prediction mode, we show that human transcripts can be used to infer over 62% of mouse protein isoforms. When restricting the predictions to transcripts known eight years ago, the percentage grows to 74%. Using CCDS timestamped releases, we also analyze the evolution of the number of splicing orthologs over the last decade. CONCLUSIONS: Alternative splicing is now recognized to play a major role in the protein diversity of eukaryotic organisms, but definitions of spliced isoform orthologs are still approximate. Here we propose a definition adapted to the subtle variations of conserved alternative splicing sites, and use it to validate numerous accurate orthologous isoform predictions.

Subjects

Algorithms ; Proteins/genetics ; Transcriptome ; Alternative Splicing ; Animals ; Computational Biology ; Exons ; Humans ; Mice ; Protein Isoforms/chemistry ; Protein Isoforms/genetics ; Protein Isoforms/metabolism ; Proteins/chemistry ; Proteins/metabolism ; RNA/chemistry ; RNA/genetics ; RNA/metabolism
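
The notion of splicing orthologs in the abstract above, pairs of transcripts with the same splicing pattern, can be sketched in simplified form. This is an assumption-laden toy, not the authors' procedure: it supposes that orthologous splice-site pairs have already been identified and are supplied as a hypothetical `site_map`, it represents each transcript only by the set of splice sites it uses, and it omits the protein-level alignment step that the abstract also mentions. All identifiers are invented.

```python
def splicing_orthologs(human, mouse, site_map):
    """Pair transcripts whose splice-site sets correspond exactly under site_map.

    human, mouse: dict mapping transcript id -> iterable of splice-site ids.
    site_map: dict mapping a human splice-site id -> its orthologous mouse site id
              (hypothetical input; computing it is the hard part and is not shown).
    """
    # Index mouse transcripts by their set of splice sites.
    by_sites = {}
    for mouse_id, sites in mouse.items():
        by_sites.setdefault(frozenset(sites), []).append(mouse_id)

    pairs = []
    for human_id, sites in human.items():
        sites = frozenset(sites)
        # Every human site needs a known ortholog in mouse.
        if not all(site in site_map for site in sites):
            continue
        mapped = frozenset(site_map[site] for site in sites)
        # Two human sites mapping to one mouse site is not a one-to-one match.
        if len(mapped) != len(sites):
            continue
        for mouse_id in by_sites.get(mapped, []):
            pairs.append((human_id, mouse_id))
    return pairs


if __name__ == "__main__":
    human = {"hT1": ["h1", "h2", "h3", "h4"], "hT2": ["h1", "h4"]}
    mouse = {"mT1": ["m1", "m2", "m3", "m4"], "mT2": ["m1", "m3"]}
    site_map = {"h1": "m1", "h2": "m2", "h3": "m3", "h4": "m4"}
    print(splicing_orthologs(human, mouse, site_map))
```

In the toy data, `hT1` and `mT1` use corresponding sites throughout and are paired, while `hT2` skips an exon that no mouse transcript skips in the same way and is left unpaired.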

Reconstructing the modular recombination history of Staphylococcus aureus phages.

Swenson, Krister M; Guertin, Paul; Deschênes, Hugo; Bergeron, Anne.

BMC Bioinformatics ; 14 Suppl 15: S17, 2013.

Article in English | MEDLINE | ID: mdl-24564731

ABSTRACT

BACKGROUND: Viruses that infect bacteria, called phages, are well known for their extreme mosaicism, in which an individual genome shares many different parts with many others. The mechanisms for creating these mosaics are largely unknown but are believed to be recombinations, either illegitimate, or partly homologous. In order to reconstruct the history of these recombinations, we need to identify the positions where recombinations may have occurred, and develop algorithms to generate and explore the possible reconstructions. RESULTS: We first show that, provided that their gene order is co-linear, genomes of phages can be aligned, even if large parts of their sequences lack any detectable similarity and are annotated as hypothetical proteins. We give such an alignment for 31 Staphylococcus aureus phage genomes, and algorithms that can be used in any similar context. These alignments provide the datasets needed for a combinatorial study of recombinations. We next reconstruct the most likely recombination history of the set of 31 phages, under the hypothesis that recombinations are partly homologous. This history relies on the computational identification of missing phages. CONCLUSIONS: This first combinatorial study of modular recombinations acts as a proof of concept. We show that alignments of whole genomes are feasible for large sets of phages, and that this representation yields data that can be used to reconstruct parts of the evolutionary history of these organisms.

Subjects

Bacteriophages/genetics ; Genome, Viral ; Recombination, Genetic ; Staphylococcus aureus/virology ; Algorithms ; Sequence Analysis, DNA ; Staphylococcus aureus/genetics
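
To make the combinatorial view of modular recombination in the abstract above more concrete, here is a toy sketch under strong simplifying assumptions. It is not the reconstruction algorithm of the paper: it assumes the phages are already aligned into columns, that each phage is a tuple holding one hypothetical variant label per column, and that a recombination is a single crossover taking a prefix of columns from one parent and the remaining suffix from another. It only asks whether a given phage can be explained that way by two other phages in the set.

```python
from itertools import permutations


def crossover_points(child, left, right):
    """Return every k with child == left[:k] + right[k:] (0 < k < len(child))."""
    n = len(child)
    if len(left) != n or len(right) != n:
        raise ValueError("aligned phages must have the same number of columns")
    return [
        k for k in range(1, n)
        if child[:k] == left[:k] and child[k:] == right[k:]
    ]


def explain(child_name, phages):
    """List (left_parent, right_parent, k) triples that reproduce the child."""
    child = phages[child_name]
    others = [name for name in phages if name != child_name]
    return [
        (left, right, k)
        for left, right in permutations(others, 2)
        for k in crossover_points(child, phages[left], phages[right])
    ]


if __name__ == "__main__":
    # Hypothetical aligned phages: one variant label per alignment column.
    phages = {
        "a": (1, 1, 1, 1),
        "b": (2, 2, 2, 2),
        "c": (1, 1, 2, 2),
    }
    print(explain("c", phages))
```

A phage for which `explain` returns an empty list has no pair of parents in the set under this toy model, which loosely mirrors the situation in which the abstract's reconstruction has to posit missing phages.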
