1.
IEEE/ACM Trans Comput Biol Bioinform ; 16(4): 1364-1373, 2019.
Article in English | MEDLINE | ID: mdl-28166504

ABSTRACT

Reconstructing ancestral gene orders in a given phylogeny is a classical problem in comparative genomics. Most existing methods compare conserved features in extant genomes in the phylogeny to define potential ancestral gene adjacencies, and either try to reconstruct all ancestral genomes under a global evolutionary parsimony criterion, or, focusing on a single ancestral genome, use a scaffolding approach to select a subset of ancestral gene adjacencies, generally aiming at reducing the fragmentation of the reconstructed ancestral genome. In this paper, we describe an exact algorithm for the Small Parsimony Problem that combines both approaches. We consider that gene adjacencies at internal nodes of the species phylogeny are weighted, and we introduce an objective function defined as a convex combination of these weights and the evolutionary cost under the Single-Cut-or-Join (SCJ) model. The weights of ancestral gene adjacencies can, e.g., be obtained through the recent availability of ancient DNA sequencing data, which provide a direct hint at the genome structure of the considered ancestor, or through probabilistic analysis of gene adjacency evolution. We show the NP-hardness of our problem variant and propose a Fixed-Parameter Tractable algorithm, based on the Sankoff-Rousseau dynamic programming algorithm, that also allows sampling of co-optimal solutions. We apply our approach to mammalian and bacterial data providing different degrees of complexity. We show that including adjacency weights in the objective has a significant impact in reducing the fragmentation of the reconstructed ancestral gene orders. An implementation is available at http://github.com/nluhmann/PhySca.
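To make the objective described in this abstract more concrete, the following is a minimal sketch, not the PhySca implementation. It assumes genomes are represented as sets of adjacencies between gene extremities, uses the standard SCJ distance (size of the symmetric difference of two adjacency sets), and assumes one plausible form of the weight term: each adjacency kept at a weighted node is penalised by one minus its weight. The names `adj`, `scj_distance`, `combined_cost`, and `alpha` are hypothetical, and the exact definition in the paper may differ.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Adjacency = FrozenSet[str]  # unordered pair of gene extremities, e.g. {"a_h", "b_t"}
Genome = Set[Adjacency]


def adj(x: str, y: str) -> Adjacency:
    """Build an adjacency from two gene extremities (order does not matter)."""
    return frozenset((x, y))


def scj_distance(g1: Genome, g2: Genome) -> int:
    """SCJ distance: one cut per adjacency only in g1, one join per adjacency only in g2."""
    return len(g1 - g2) + len(g2 - g1)


def combined_cost(
    edges: Iterable[Tuple[str, str]],
    genomes: Dict[str, Genome],
    weights: Dict[str, Dict[Adjacency, float]],
    alpha: float,
) -> float:
    """Convex combination of the SCJ cost along tree edges and a weight penalty.

    Assumption (not taken from the abstract): an adjacency kept at a weighted
    node costs (1 - weight), and an adjacency without a weight counts as weight 0.
    """
    scj_cost = sum(scj_distance(genomes[u], genomes[v]) for u, v in edges)
    penalty = sum(
        1.0 - weights[node].get(a, 0.0)
        for node in weights          # only nodes that carry weights
        for a in genomes[node]       # only adjacencies actually selected there
    )
    return alpha * scj_cost + (1.0 - alpha) * penalty
```

With `alpha = 1.0` the sketch reduces to the plain SCJ parsimony cost. Lowering `alpha` gives more influence to the adjacency weights, which is the trade-off the abstract describes.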


Subjects
Algorithms, Computational Biology/methods, Bacterial Genome, Genomics/methods, Animals, Biological Evolution, Computer Simulation, Genetic Databases, Molecular Evolution, Gene Order, Genetic Markers/genetics, Genetic Models, Opossums/genetics, Phylogeny, Plasmids/metabolism, Probability, Reproducibility of Results, Swine/genetics, Yersinia/genetics
2.
BMC Bioinformatics ; 16 Suppl 19: S1, 2015.
Article in English | MEDLINE | ID: mdl-26695008

ABSTRACT

Finding the smallest sequence of operations to transform one genome into another is an important problem in comparative genomics. The breakpoint graph is a discrete structure that has proven to be effective in solving distance problems, and the number of cycles in a cycle decomposition of this graph is one of the remarkable parameters to help in the solution of related problems. For a fixed k, the number of linear unichromosomal genomes (signed or unsigned) with n elements such that the induced breakpoint graphs have k disjoint cycles, known as the Hultman number, has already been determined. In this work we extend these results to multichromosomal genomes, providing formulas to compute the number of multichromosomal genomes having a fixed number of cycles and/or paths. We obtain an explicit formula for circular multichromosomal genomes and recurrences for general multichromosomal genomes, and discuss how these series can be used to calculate the distribution and expected value of the rearrangement distance between random genomes.
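As a small illustration of the quantity being counted, the sketch below computes the number of cycles in the breakpoint graph of a signed linear unichromosomal genome against the identity, and tabulates it by brute force over all signed permutations of a small size. It assumes the usual construction in which gene x becomes the vertex pair (2x-1, 2x), reversed for negative genes, framed by 0 and 2n+1. The abstract does not spell out the construction, so the convention and the function names `breakpoint_graph_cycles` and `cycle_histogram` are assumptions, and the enumeration is only a cross-check for tiny n, not the paper's formulas.

```python
from collections import Counter
from itertools import permutations, product
from typing import Dict, List, Sequence


def breakpoint_graph_cycles(perm: Sequence[int]) -> int:
    """Count alternating cycles of the breakpoint graph of a signed permutation vs. identity."""
    n = len(perm)
    # Unsigned extension: +x -> (2x-1, 2x), -x -> (2x, 2x-1), framed by 0 and 2n+1.
    ext: List[int] = [0]
    for x in perm:
        a = abs(x)
        ext += [2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1]
    ext.append(2 * n + 1)

    # Black edges join the vertices at positions (2i, 2i+1) of the extension.
    black: Dict[int, int] = {}
    for i in range(0, 2 * n + 2, 2):
        u, v = ext[i], ext[i + 1]
        black[u], black[v] = v, u

    seen = set()
    cycles = 0
    for start in ext:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]      # follow the black edge
            seen.add(w)
            v = w ^ 1         # follow the grey edge: values 2i and 2i+1 are paired
    return cycles


def cycle_histogram(n: int) -> Counter:
    """Brute force: how many signed permutations of size n have k cycles, for each k."""
    hist: Counter = Counter()
    for order in permutations(range(1, n + 1)):
        for signs in product((1, -1), repeat=n):
            hist[breakpoint_graph_cycles([s * g for s, g in zip(signs, order)])] += 1
    return hist
```

Only the identity reaches the maximum of n + 1 cycles, so `cycle_histogram(n)[n + 1]` is 1. The enumeration grows as 2^n · n! and is practical only for very small n.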


Subjects
Chromosome Breakage, Chromosomes/genetics, Gene Rearrangement, Algorithms, Circular DNA/genetics, Genome, Genome Size/genetics
3.
PLoS One ; 9(8): e105015, 2014.
Article in English | MEDLINE | ID: mdl-25137074

ABSTRACT

The elucidation of orthology relationships is an important step both in gene function prediction and towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches incur excessive computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance), was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets.


Subjects
Genetic Models, Software, Synteny, Bacterial Proteins/genetics, Cluster Analysis, Computer Simulation, Datasets as Topic, Bacterial Genes
4.
Nucleic Acids Res ; 42(15): 9854-61, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25056310

ABSTRACT

Genomes undergo changes in organization as a result of gene duplications, chromosomal rearrangements and local mutations, among other mechanisms. In contrast to prokaryotes, in which genes of a common function are often organized in operons and reside contiguously along the genome, most eukaryotes show much weaker clustering of genes by function, except for a few concrete functional groups. We set out to check systematically whether there is a relation between gene function and gene organization in the human genome. We test this question for three types of functional groups: pairs of interacting proteins, complexes and pathways. We find a significant concentration of functional groups both in terms of their distance within the same chromosome and in terms of their dispersal over several chromosomes. Moreover, using a Hi-C contact map of the tendency of chromosomal segments to appear close in the 3D space of the nucleus, we show that members of the same functional group that reside on distinct chromosomes tend to co-localize in space. The result holds for all three types of functional groups that we tested. Hence, the human genome shows substantial concentration of functional groups within chromosomes and across chromosomes in space.


Subjects
Cell Nucleus/genetics, Human Chromosomes, Genes, Human Genome, Humans, Intranuclear Space, Protein Interaction Mapping
5.
BMC Bioinformatics ; 13 Suppl 19: S3, 2012.
Article in English | MEDLINE | ID: mdl-23281826

ABSTRACT

BACKGROUND: The comparison of relative gene orders between two genomes offers deep insights into functional correlations of genes and the evolutionary relationships between the corresponding organisms. Methods for gene order analyses often require prior knowledge of homologies between all genes of the genomic dataset. Since such information is hard to obtain, it is common to predict homologous groups based on sequence similarity. These hypothetical groups of homologous genes are called gene families. RESULTS: This manuscript promotes a new branch of gene order studies in which prior assignment of gene families is not required. As a case study, we present a new similarity measure between pairs of genomes that is related to the breakpoint distance. We propose an exact and a heuristic algorithm for its computation. We evaluate our methods on a dataset comprising 12 γ-proteobacteria from the literature. CONCLUSIONS: In evaluating our algorithms, we show that the exact algorithm is suitable for computations on small genomes. Moreover, the results of our heuristic are close to those of the exact algorithm. In general, we demonstrate that gene order studies can be improved by direct, gene family assignment-free comparisons.


Subjects
Gene Order, Bacterial Genome/genetics, Genomics/methods, Multigene Family, DNA Sequence Analysis/methods, Algorithms, Gammaproteobacteria/genetics
6.
J Comput Biol ; 15(8): 1093-115, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18774903

ABSTRACT

Comparing genomes of different species is a fundamental problem in comparative genomics. Recent research has resulted in the introduction of different measures between pairs of genomes: for example, reversal distance, number of breakpoints, and number of common or conserved intervals. However, classical methods used for computing such measures are seriously compromised when genomes have several copies of the same gene scattered across them. Most approaches to overcome this difficulty are based either on the exemplar model, which keeps exactly one copy in each genome of each duplicated gene, or on the maximum matching model, which keeps as many copies as possible of each duplicated gene. The goal is to find an exemplar matching, respectively a maximum matching, that optimizes the studied measure. Unfortunately, it turns out that, in the presence of duplications, this problem is NP-hard for each of the above-mentioned measures. In this paper, we propose to compute the minimum number of breakpoints and the maximum number of adjacencies between two genomes in the presence of duplications using two different approaches. The first one is an exact, generic 0-1 linear programming approach, while the second is a collection of three heuristics. Each of these approaches is applied to each problem and for each of the following models: exemplar, maximum matching, and an intermediate model that we introduce here. All these programs are run on a well-known public benchmark dataset of gamma-Proteobacteria, and their performances are discussed.
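The exemplar model described in this abstract can be illustrated on toy inputs with the sketch below. It is a brute-force enumeration, not the 0-1 linear programming formulation or the heuristics of the paper, and it assumes signed linear genomes given as lists of integers in which both genomes contain the same gene families. A consecutive pair (a, b) counts as conserved if the other genome contains (a, b) or its reverse complement (-b, -a). The function names are hypothetical.

```python
from itertools import product
from typing import Dict, Iterator, List, Sequence


def breakpoints(g1: Sequence[int], g2: Sequence[int]) -> int:
    """Number of consecutive pairs of g1 that are not conserved in g2 (no duplicates assumed)."""
    conserved = set()
    for a, b in zip(g2, g2[1:]):
        conserved.add((a, b))
        conserved.add((-b, -a))  # the same adjacency read on the opposite strand
    return sum((a, b) not in conserved for a, b in zip(g1, g1[1:]))


def exemplars(genome: Sequence[int]) -> Iterator[List[int]]:
    """Yield every genome obtained by keeping exactly one copy of each gene family."""
    positions: Dict[int, List[int]] = {}
    for i, gene in enumerate(genome):
        positions.setdefault(abs(gene), []).append(i)
    for choice in product(*positions.values()):
        yield [genome[i] for i in sorted(choice)]  # preserve the original order


def exemplar_breakpoints(g1: Sequence[int], g2: Sequence[int]) -> int:
    """Minimum number of breakpoints over all pairs of exemplar genomes (exponential time)."""
    return min(breakpoints(e1, e2) for e1 in exemplars(g1) for e2 in exemplars(g2))
```

For example, with `g1 = [1, 2, 1, 3]` and `g2 = [1, 2, 3]`, keeping the first copy of gene 1 in `g1` yields `[1, 2, 3]`, which has no breakpoints against `g2`. The number of exemplar choices is the product of the copy numbers of all families, which is why exact methods such as the linear programming approach mentioned in the abstract are needed on real data.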


Subjects
Computational Biology/methods, Duplicate Genes, Bacterial Genome/genetics, Genomics/methods, Algorithms, Gammaproteobacteria/genetics, Genetic Models, Multigene Family