Results 1 - 20 of 21
1.
Algorithms Mol Biol ; 18(1): 16, 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37940998

ABSTRACT

BACKGROUND: Evolutionary scenarios describing the evolution of a family of genes within a collection of species comprise the mapping of the vertices of a gene tree T to vertices and edges of a species tree S. The relative timing of the last common ancestors of two extant genes (leaves of T) and the last common ancestors of the two species (leaves of S) in which they reside is indicative of horizontal gene transfers (HGT) and ancient duplications. Orthologous gene pairs, on the other hand, require that their last common ancestor coincides with a corresponding speciation event. The relative timing information of gene and species divergences is captured by three colored graphs that have the extant genes as vertices and the species in which the genes are found as vertex colors: the equal-divergence-time (EDT) graph, the later-divergence-time (LDT) graph and the prior-divergence-time (PDT) graph, which together form an edge partition of the complete graph. RESULTS: Here we give a complete characterization in terms of informative and forbidden triples that can be read off the three graphs and provide a polynomial-time algorithm for constructing an evolutionary scenario that explains the graphs, provided such a scenario exists. While both LDT and PDT graphs are cographs, this is not true for the EDT graph in general. We show that every EDT graph is perfect. While the information about LDT and PDT graphs is necessary to recognize EDT graphs in polynomial time for general scenarios, this extra information can be dropped in the HGT-free case. However, recognition of EDT graphs without knowledge of putative LDT and PDT graphs is NP-complete for general scenarios. In contrast, PDT graphs can be recognized in polynomial time. We finally connect the EDT graph to the alternative definitions of orthology that have been proposed for scenarios with horizontal gene transfer. With one exception, the corresponding graphs are shown to be colored cographs.
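
The partition of gene pairs into EDT, LDT and PDT edges described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes that divergence ages (larger means older) are already known for every gene pair and every species pair, and that two genes in the same species are compared against a species age of 0. The function name `classify_pairs` and the input dictionaries are hypothetical.

```python
from itertools import combinations


def classify_pairs(genes, species_of, gene_lca_age, species_lca_age):
    """Sort every gene pair into the EDT, LDT or PDT edge set.

    Assumption: ages are measured backwards from the present, so a
    larger value means an older divergence. Pairs of genes from the
    same species are compared against a species age of 0.
    """
    graphs = {"EDT": set(), "LDT": set(), "PDT": set()}
    for x, y in combinations(sorted(genes), 2):
        g_age = gene_lca_age[frozenset((x, y))]
        sx, sy = species_of[x], species_of[y]
        s_age = 0 if sx == sy else species_lca_age[frozenset((sx, sy))]
        if g_age == s_age:
            kind = "EDT"  # genes diverged together with their species
        elif g_age < s_age:
            kind = "LDT"  # genes diverged later than their species
        else:
            kind = "PDT"  # genes diverged prior to their species
        graphs[kind].add((x, y))
    return graphs


# Toy data (invented for illustration): species A and B split at age 10.
species_of = {"a": "A", "b": "B", "c": "B"}
species_lca_age = {frozenset(("A", "B")): 10}
gene_lca_age = {
    frozenset(("a", "b")): 10,  # coincides with the speciation
    frozenset(("a", "c")): 4,   # younger than the speciation
    frozenset(("b", "c")): 10,  # both genes reside in species B
}
result = classify_pairs(["a", "b", "c"], species_of, gene_lca_age, species_lca_age)
```

Every pair lands in exactly one of the three sets, which mirrors the edge partition of the complete graph. Exact equality of ages is an idealization; real divergence-time estimates would need a tolerance.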

2.
J Math Biol ; 83(1): 10, 2021 07 03.
Article in English | MEDLINE | ID: mdl-34218334

ABSTRACT

Several implicit methods to infer horizontal gene transfer (HGT) focus on pairs of genes that have diverged only after the divergence of the two species in which the genes reside. This situation defines the edge set of a graph, the later-divergence-time (LDT) graph, whose vertices correspond to genes colored by their species. We investigate these graphs in the setting of relaxed scenarios, i.e., evolutionary scenarios that encompass all commonly used variants of duplication-transfer-loss scenarios in the literature. We characterize LDT graphs as a subclass of properly vertex-colored cographs, and provide a polynomial-time recognition algorithm as well as an algorithm to construct a relaxed scenario that explains a given LDT graph. An edge in an LDT graph implies that the two corresponding genes are separated by at least one HGT event. The converse is not true, however. We show that the complete xenology relation is described by an rs-Fitch graph, i.e., a complete multipartite graph satisfying constraints on the vertex coloring. This class of vertex-colored graphs is also recognizable in polynomial time. We finally address the question "how much information about all HGT events is contained in LDT graphs" with the help of simulations of evolutionary scenarios with a wide range of duplication, loss, and HGT events. In particular, we show that a simple greedy graph editing scheme can be used to efficiently detect HGT events that are implicitly contained in LDT graphs.


Subjects
Algorithms, Horizontal Gene Transfer, Phylogeny
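
The abstract above states that LDT graphs form a subclass of properly vertex-colored cographs. The sketch below checks only those two necessary conditions: no edge joins two genes of the same color, and no four vertices induce a path P4 (the standard forbidden subgraph for cographs). It is a brute-force illustration, not the polynomial-time recognition algorithm of the paper, and passing it does not prove that a graph is an LDT graph. All function names are hypothetical.

```python
from itertools import combinations, permutations


def is_properly_colored(edges, color):
    """True if no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u, v in edges)


def has_induced_p4(vertices, edges):
    """True if some four vertices induce a path a-b-c-d."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            path_edges = b in adj[a] and c in adj[b] and d in adj[c]
            no_chords = (c not in adj[a] and d not in adj[a]
                         and d not in adj[b])
            if path_edges and no_chords:
                return True
    return False


def passes_necessary_ldt_checks(vertices, edges, color):
    """Properly colored and P4-free: necessary, not sufficient."""
    return (is_properly_colored(edges, color)
            and not has_induced_p4(vertices, edges))
```

The quartic enumeration is only meant for small examples; linear-time cograph recognition algorithms exist but are considerably longer.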
3.
Virus Evol ; 6(2): veaa033, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32704383

ABSTRACT

The genealogy of the hepatitis C virus (HCV) and the genus Hepacivirus remains elusive despite numerous recently discovered animal hepaciviruses (HVs). Viruses from evolutionarily ancient mammals might elucidate the HV macro-evolutionary patterns. Here, we investigated sixty-seven two-toed and nine three-toed sloths from Costa Rica for HVs using molecular and serological tools. A novel sloth HV was detected by reverse transcription polymerase chain reaction (RT-PCR) in three-toed sloths (2/9, 22.2%; 95% confidence interval (CI), 5.3-55.7). Genomic characterization revealed typical HV features including overall polyprotein gene structure, a type 4 internal ribosomal entry site in the viral 5'-genome terminus, an A-U-rich region and X-tail structure in the viral 3'-genome terminus. Different from other animal HVs, HV seropositivity in two-toed sloths was low at 4.5 per cent (3/67; CI, 1.0-12.9), whereas the RT-PCR-positive three-toed sloths were seronegative. Limited cross-reactivity of the serological assay implied exposure of seropositive two-toed sloths to HVs of unknown origin and recent infections in RT-PCR-positive animals preceding seroconversion. Recent infections were consistent with only 9 nucleotide exchanges between the two sloth HVs, located predominantly within the E1/E2 encoding regions. Translated sequence distances of NS3 and NS5 proteins and host comparisons suggested that the sloth HV represents a novel HV species. Event- and sequence distance-based reconciliations of phylogenies of HVs and of their hosts revealed complex macro-evolutionary patterns, including both long-term evolutionary associations and host switches, most strikingly from rodents into sloths. Ancestral state reconstructions corroborated rodents as predominant sources of HV host switches during the genealogy of extant HVs. 
Sequence distance comparisons, partial conservation of critical amino acid residues associated with HV entry and selection pressure signatures of host genes encoding entry and antiviral protein orthologs were consistent with HV host switches between genetically divergent mammals, including the projected host switch from rodents into sloths. Structural comparison of HCV and sloth HV E2 proteins suggested conserved modes of hepaciviral entry. Our data corroborate complex macro-evolutionary patterns shaping the genus Hepacivirus, highlight that host switches are possible across highly diverse host taxa, and elucidate a prominent role of rodent hosts during the Hepacivirus genealogy.

4.
PLoS Pathog ; 15(12): e1008224, 2019 12.
Article in English | MEDLINE | ID: mdl-31830128

ABSTRACT

The spectrum of viruses in insects is important for subjects as diverse as public health, veterinary medicine, food production, and biodiversity conservation. The traditional interest in vector-borne diseases of humans and livestock has drawn the attention of virus studies to hematophagous insect species. However, these represent only a tiny fraction of the broad diversity of Hexapoda, the most speciose group of animals. Here, we systematically probed the diversity of negative strand RNA viruses in the largest and most representative collection of insect transcriptomes from samples representing all 34 extant orders of Hexapoda and 3 orders of Entognatha, as well as outgroups, altogether representing 1243 species. Based on profile hidden Markov models we detected 488 viral RNA-directed RNA polymerase (RdRp) sequences with similarity to negative strand RNA viruses. These were identified in members of 324 arthropod species. Selection for length, quality, and uniqueness left 234 sequences for analyses, showing similarity to genomes of viruses classified in Bunyavirales (n = 86), Articulavirales (n = 54), and several orders within Haploviricotina (n = 94). Coding-complete genomes or nearly-complete subgenomic assemblies were obtained in 61 cases. Based on phylogenetic topology and the availability of coding-complete genomes we estimate that at least 20 novel viral genera in seven families need to be defined, only two of them monospecific. Seven additional viral clades emerge when adding sequences from the present study to formerly monospecific lineages, potentially requiring up to seven additional genera. One long sequence may indicate a novel family. For segmented viruses, cophylogenies between genome segments were generally improved by the inclusion of viruses from the present study, suggesting that in silico misassembly of segmented genomes is rare or absent. 
Contrary to previous assessments, significant virus-host codivergence was identified in major phylogenetic lineages based on two different approaches of codivergence analysis in a hypotheses testing framework. In spite of these additions to the known spectrum of viruses in insects, we caution that basing taxonomic decisions on genome information alone is challenging due to technical uncertainties, such as the inability to prove integrity of complete genome assemblies of segmented viruses.


Subjects
Insects/virology, RNA Virus Infections/virology, RNA Viruses, Animals
5.
Mol Ecol ; 28(17): 4118-4133, 2019 09.
Article in English | MEDLINE | ID: mdl-31232488

ABSTRACT

Plant-pollinator interactions are often highly specialised, which may be a consequence of co-evolution. Yet when plants and pollinators co-evolve, it is not clear if this will also result in frequent cospeciation. Here, we investigate the mutual evolutionary history of South African oil-collecting Rediviva bees and their Diascia host plants, in which the elongated forelegs of female Rediviva have been suggested to coevolve with the oil-producing spurs of their Diascia hosts. After controlling for phylogenetic nonindependence, we found Rediviva foreleg length to be significantly correlated with Diascia spur length, suggestive of co-evolution. However, as trait correlation could also be due to pollinator shifts, we tested if cospeciation or pollinator shifts have dominated the evolution of Rediviva-Diascia interactions by analysing phylogenies in a cophylogenetic framework. Distance-based cophylogenetic analyses (PARAFIT, PACo) indicated significant congruence of the two phylogenies under most conditions. Yet, we found that phylogenetic relatedness was correlated with ecological similarity (the spectrum of partners that each taxon interacted with) only for Diascia but not for Rediviva, suggesting that phylogenetic congruence might be due to phylogenetic tracking by Diascia of Rediviva rather than strict (reciprocal) co-evolution. Furthermore, event-based reconciliation using a parsimony approach (CORE-PA) on average revealed only 11-13 cospeciation events but 58-80 pollinator shifts. Probabilistic cophylogenetic analyses (COALA) supported this trend (8-29 cospeciations vs. 40 pollinator shifts). Our study suggests that diversification of Diascia has been largely driven by Rediviva (phylogenetic tracking, pollinator shifts) but not vice versa. Moreover, our data suggest that, even in co-evolving mutualisms, cospeciation events might occur only infrequently.


Subjects
Bees/genetics, Biological Evolution, Genetic Speciation, Host-Parasite Interactions/genetics, Pollination/physiology, Scrophulariaceae/parasitology, Animals, Phylogeny, Heritable Quantitative Trait
6.
PLoS One ; 13(9): e0204907, 2018.
Article in English | MEDLINE | ID: mdl-30265723

ABSTRACT

For each given pair of (rooted or unrooted) topological trees with the same number of leaves, a strict upper bound is shown for the tree partition distance (also called symmetric difference metric and Robinson-Foulds distance) in the case of unrooted trees, and for the cluster distance (also called Robinson-Foulds distance) in the case of rooted trees, of corresponding phylogenetic trees. In particular, it is shown that there exist assignments of labels (e.g., species) to the leaves of both topological trees, where each label is assigned to exactly one leaf in each tree, such that: i) in the unrooted case, the tree partition distance between the corresponding phylogenetic trees equals the number of internal edges in both trees minus the number of nodes with degree 2 in both trees, ii) in the rooted case, the cluster distance between any two corresponding phylogenetic trees equals the number of internal edges in both trees minus the number of nodes with degree 2 in both trees, and iii) the values in (i) and (ii) are also the maximum values with respect to all possible assignments. The shown strict worst-case bounds are needed as normalization factors to compute a normalized version of the respective tree partition metrics.


Subjects
Genetic Models, Phylogeny
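
As a reminder of what the cluster distance in this abstract measures, here is a minimal sketch for rooted trees. It assumes trees are written as nested tuples with string leaf labels and counts the clusters (leaf sets below internal nodes) that occur in exactly one of the two trees. The function names are hypothetical, and the normalization bound derived in the paper is not reproduced here.

```python
def clusters(tree):
    """Collect the leaf set below every internal node of a rooted tree.

    Assumption: a tree is a nested tuple whose leaves are strings,
    e.g. (("a", "b"), ("c", "d")).
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def cluster_distance(t1, t2):
    """Number of clusters present in exactly one of the two trees."""
    return len(clusters(t1) ^ clusters(t2))


balanced = (("a", "b"), ("c", "d"))
swapped = (("a", "c"), ("b", "d"))
caterpillar = ((("a", "b"), "c"), "d")
```

Here `cluster_distance(balanced, swapped)` is 4, because the two trees share only the root cluster, while `cluster_distance(balanced, caterpillar)` is 2. Dividing such a value by a worst-case bound for the given pair of topologies yields a normalized distance.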
7.
J Math Biol ; 77(5): 1459-1491, 2018 11.
Article in English | MEDLINE | ID: mdl-29951855

ABSTRACT

Two genes are xenologs in the sense of Fitch if they are separated by at least one horizontal gene transfer event. Horizontal gene transfer is asymmetric in the sense that the transferred copy is distinguished from the one that remains within the ancestral lineage. Hence xenology is more precisely thought of as a non-symmetric relation: y is xenologous to x if y has been horizontally transferred at least once since it diverged from the least common ancestor of x and y. We show that xenology relations are characterized by a small set of forbidden induced subgraphs on three vertices. Furthermore, each xenology relation can be derived from a unique least-resolved edge-labeled phylogenetic tree. We provide a linear-time algorithm for the recognition of xenology relations and for the construction of its least-resolved edge-labeled phylogenetic tree. Finally, the fact that being a xenology relation is a heritable graph property has far-reaching consequences for approximation problems associated with xenology relations.


Subjects
Horizontal Gene Transfer, Genetic Models, Multigene Family, Phylogeny, Algorithms, Computer Simulation, Gene Duplication, Genetic Speciation, Heuristics, Mathematical Concepts
8.
J Virol ; 92(13)2018 07 01.
Article in English | MEDLINE | ID: mdl-29695421

ABSTRACT

The discovery of highly diverse nonprimate hepatoviruses illuminated the evolutionary origins of hepatitis A virus (HAV) ancestors in mammals other than primates. Marsupials are ancient mammals that diverged from other Eutheria during the Jurassic. Viruses from marsupials may thus provide important insight into virus evolution. To investigate Hepatovirus macroevolutionary patterns, we sampled 112 opossums in northeastern Brazil. A novel marsupial HAV (MHAV) in the Brazilian common opossum (Didelphis aurita) was detected by nested reverse transcription-PCR (RT-PCR). MHAV concentration in the liver was high, at 2.5 × 10^9 RNA copies/g, and at least 300-fold higher than those in other solid organs, suggesting hepatotropism. Hepatovirus seroprevalence in D. aurita was 26.6% as determined using an enzyme-linked immunosorbent assay (ELISA). Endpoint titers in confirmatory immunofluorescence assays were high, and marsupial antibodies colocalized with anti-HAV control sera, suggesting specificity of serological detection and considerable antigenic relatedness between HAV and MHAV. MHAV showed all genomic hallmarks defining hepatoviruses, including late-domain motifs likely involved in quasi-envelope acquisition, a predicted C-terminal pX extension of VP1, strong avoidance of CpG dinucleotides, and a type 3 internal ribosomal entry site. Translated polyprotein gene sequence distances of at least 23.7% from other hepatoviruses suggested that MHAV represents a novel Hepatovirus species. Conserved predicted cleavage sites suggested similarities in polyprotein processing between HAV and MHAV. MHAV was nested within rodent hepatoviruses in phylogenetic reconstructions, suggesting an ancestral hepatovirus host switch from rodents into marsupials. Cophylogenetic reconciliations of host and hepatovirus phylogenies confirmed that host-independent macroevolutionary patterns shaped the phylogenetic relationships of extant hepatoviruses.
Although marsupials are synanthropic and consumed as wild game in Brazil, HAV community protective immunity may limit the zoonotic potential of MHAV. IMPORTANCE: Hepatitis A virus (HAV) is a ubiquitous cause of acute hepatitis in humans. Recent findings revealed the evolutionary origins of HAV and the genus Hepatovirus defined by HAV in mammals other than primates in general and in small mammals in particular. The factors shaping the genealogy of extant hepatoviruses are unclear. We sampled marsupials, one of the most ancient mammalian lineages, and identified a novel marsupial HAV (MHAV). The novel MHAV shared specific features with HAV, including hepatotropism, antigenicity, genome structure, and a common ancestor in phylogenetic reconstructions. Coevolutionary analyses revealed that host-independent evolutionary patterns contributed most to the current phylogeny of hepatoviruses and that MHAV was the most drastic example of a cross-order host switch of any hepatovirus observed so far. The divergence of marsupials from other mammals offers unique opportunities to investigate HAV species barriers and whether mechanisms of HAV immune control are evolutionarily conserved.


Subjects
Hepatitis A virus/classification, Liver/virology, Marsupialia/virology, Animals, Viral Antibodies/metabolism, Brazil, Molecular Evolution, Hepatitis A virus/genetics, Hepatitis A virus/physiology, Liver/immunology, Marsupialia/immunology, Phylogeny, Viral Proteins/chemistry, Viral Proteins/genetics, Viral Tropism
9.
Algorithms Mol Biol ; 13: 2, 2018.
Article in English | MEDLINE | ID: mdl-29441122

ABSTRACT

BACKGROUND: In the absence of horizontal gene transfer it is possible to reconstruct the history of gene families from empirically determined orthology relations, which are equivalent to event-labeled gene trees. Knowledge of the event labels considerably simplifies the problem of reconciling a gene tree T with a species tree S, relative to the reconciliation problem without prior knowledge of the event types. It is well-known that optimal reconciliations in the unlabeled case may violate time-consistency and thus are not biologically feasible. Here we investigate the mathematical structure of the event-labeled reconciliation problem with horizontal transfer. RESULTS: We investigate the issue of time-consistency for the event-labeled version of the reconciliation problem, provide a convenient axiomatic framework, and derive a complete characterization of time-consistent reconciliations. This characterization depends on certain weak conditions on the event-labeled gene trees that reflect conditions under which evolutionary events are observable at least in principle. We give an [Formula: see text]-time algorithm to decide whether a time-consistent reconciliation map exists. It does not require the construction of explicit timing maps, but relies entirely on the comparably easy task of checking whether a small auxiliary graph is acyclic. The algorithms are implemented in C++ using the boost graph library and are freely available at https://github.com/Nojgaard/tc-recon. SIGNIFICANCE: The combinatorial characterization of time consistency and thus biologically feasible reconciliation is an important step towards the inference of gene family histories with horizontal transfer from orthology data, i.e., without presupposed gene and species trees. The fast algorithm to decide time consistency is useful in a broader context because it constitutes an attractive component for all tools that address tree reconciliation problems.
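
The final step named in this abstract, checking whether an auxiliary graph is acyclic, can be sketched generically. The code below is not taken from the tc-recon repository. It assumes the auxiliary graph is directed and already available as a list of nodes and arcs, and it does not show how that graph is built from T and S. It uses Kahn's topological-sort procedure, and the function name `is_acyclic` is hypothetical.

```python
from collections import defaultdict, deque


def is_acyclic(nodes, arcs):
    """Return True if the directed graph (nodes, arcs) has no cycle.

    Kahn's algorithm: repeatedly remove nodes with no incoming arcs.
    The graph is acyclic exactly when every node can be removed.
    Assumption: `nodes` contains each node once and covers all arc ends.
    """
    indegree = {v: 0 for v in nodes}
    successors = defaultdict(list)
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1

    queue = deque(v for v in nodes if indegree[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == len(nodes)
```

The check runs in time linear in the number of nodes and arcs, which fits the abstract's description of it as the comparably easy part of the procedure.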

10.
IEEE/ACM Trans Comput Biol Bioinform ; 15(5): 1585-1593, 2018.
Article in English | MEDLINE | ID: mdl-28574364

ABSTRACT

The weighted Genome Sorting Problem (wGSP) is to find a minimum-weight sequence of rearrangement operations that transforms a given gene order into another given gene order using rearrangement operations that are associated with a predefined weight. This paper presents a polynomial-sized Integer Linear Program, called GeRe-ILP, for solving the wGSP for the following three types of rearrangement operations: inversion, transposition, and inverse transposition. GeRe-ILP uses variables and constraints for gene orders of length . It is studied experimentally on simulated data how different weighting schemes influence the reconstructed scenarios. The influences of the length of the gene orders and of the size of the reconstructed scenarios on the runtime of GeRe-ILP are studied as well.


Subjects
Gene Rearrangement/genetics, Genome/genetics, Genomics/methods, Genetic Models, Linear Programming, Algorithms, Software
11.
J Math Biol ; 75(1): 199-237, 2017 07.
Article in English | MEDLINE | ID: mdl-27904954

ABSTRACT

The concepts of orthology, paralogy, and xenology play a key role in molecular evolution. Orthology and paralogy distinguish whether a pair of genes originated by speciation or duplication. The corresponding binary relations on a set of genes form complementary cographs. Allowing more than two types of ancestral events leads to symmetric symbolic ultrametrics. Horizontal gene transfer, which leads to xenologous gene pairs, however, is inherently asymmetric since one offspring copy "jumps" into another genome, while the other continues to be inherited vertically. We therefore explore here the mathematical structure of the non-symmetric generalization of symbolic ultrametrics. Our main results tie non-symmetric ultrametrics together with di-cographs (the directed generalization of cographs), so-called uniformly non-prime ([Formula: see text]) 2-structures, and hierarchical structures on the set of strong modules. This yields a characterization of relation structures that can be explained in terms of trees and types of ancestral events. This framework accommodates a horizontal-transfer relation in terms of an ancestral event and thus is slightly different from the most commonly used definition of xenology. As a first step towards a practical use, we present a simple polynomial-time recognition algorithm of [Formula: see text] 2-structures and investigate the computational complexity of several types of editing problems for [Formula: see text] 2-structures. Finally, we show that these NP-complete problems can be solved exactly as Integer Linear Programs.


Subjects
Molecular Evolution, Biological Models, Phylogeny, Horizontal Gene Transfer
12.
Sci Rep ; 6: 23752, 2016 Apr 12.
Article in English | MEDLINE | ID: mdl-27068130

ABSTRACT

An eco-epidemiological investigation was carried out on Madagascar bat communities to better understand the evolutionary mechanisms and environmental factors that affect virus transmission among bat species in closely related members of the genus Morbillivirus, currently referred to as Unclassified Morbilli-related paramyxoviruses (UMRVs). A total of 947 bats were investigated originating from 52 capture sites (22 caves, 18 buildings, and 12 outdoor sites) distributed over different bioclimatic zones of the island. Using RT-PCR targeting the L-polymerase gene of the Paramyxoviridae family, we found that 10.5% of sampled bats were infected, representing six out of seven families and 15 out of 31 species analyzed. Univariate analysis indicates that both abiotic and biotic factors may promote viral infection. Using generalized linear modeling of UMRV infection overlaid on biotic and abiotic variables, we demonstrate that sympatric occurrence of bats is a major factor for virus transmission. Phylogenetic analyses revealed that all paramyxoviruses infecting Malagasy bats are UMRVs and showed little host specificity. Analyses using the maximum parsimony reconciliation tool CoRe-PA indicate that host-switching, rather than co-speciation, is the dominant macro-evolutionary mechanism of UMRVs among Malagasy bats.


Subjects
Biological Evolution, Host Specificity, Paramyxoviridae Infections/veterinary, Paramyxoviridae/classification, Paramyxoviridae/physiology, Viral Tropism, Animals, Chiroptera, Epidemiologic Studies, Genotype, Madagascar/epidemiology, Paramyxoviridae/genetics, Paramyxoviridae/isolation & purification, Paramyxoviridae Infections/epidemiology, Paramyxoviridae Infections/virology, Phylogeny, Prevalence, Viral RNA/genetics, Reverse Transcriptase Polymerase Chain Reaction, DNA Sequence Analysis
13.
Algorithms Mol Biol ; 11: 1, 2016.
Article in English | MEDLINE | ID: mdl-26913054

ABSTRACT

BACKGROUND: The accurate annotation of genes in newly sequenced genomes remains a challenge. Although sophisticated comparative pipelines are available, computationally derived gene models are often less than perfect. This is particularly true when multiple similar paralogs are present. The issue is aggravated further when genomes are assembled only at a preliminary draft level to contigs or short scaffolds. However, these genomes deliver valuable information for studying gene families. High accuracy models of protein coding genes are needed in particular for phylogenetics and for the analysis of gene family histories. RESULTS: We present a pipeline, ExonMatchSolver, that is designed to help the user to produce and curate high quality models of the protein-coding part of genes. The tool in particular tackles the problem of identifying those coding exon groups that belong to the same paralogous genes in a fragmented genome assembly. This paralog-to-contig assignment problem is shown to be NP-complete. It is phrased and solved as an Integer Linear Programming problem. CONCLUSIONS: The ExonMatchSolver-pipeline can be employed to build highly accurate models of protein coding genes even when spanning several genomic fragments. This sets the stage for a better understanding of the evolutionary history within particular gene families which possess a large number of paralogs and in which frequent gene duplication events occurred.

14.
FEMS Microbiol Ecol ; 92(4): fiw037, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26902801

ABSTRACT

Pathogenic Leptospira are the causative agents of leptospirosis, a disease of global concern with major impact in tropical regions. Despite the importance of this zoonosis for human health, the evolutionary and ecological drivers shaping bacterial communities in host reservoirs remain poorly investigated. Here, we describe Leptospira communities hosted by Malagasy bats, composed of mostly endemic species, in order to characterize host-pathogen associations and investigate their evolutionary histories. We screened 947 individual bats (representing 31 species, 18 genera and seven families) for Leptospira infection and subsequently genotyped positive samples using three different bacterial loci. Molecular identification showed that these Leptospira are notably diverse and include several distinct lineages mostly belonging to Leptospira borgpetersenii and L. kirschneri. The exploration of the most probable host-pathogen evolutionary scenarios suggests that bacterial genetic diversity results from a combination of events related to the ecology and the evolutionary history of their hosts. Importantly, based on the data set presented herein, the notable host-specificity we have uncovered, together with a lack of geographical structuration of bacterial genetic diversity, indicates that the Leptospira community at a given site depends on the co-occurring bat species assemblage. The implications of such tight host-specificity on the epidemiology of leptospirosis are discussed.


Subjects
Chiroptera/microbiology, Genetic Variation/genetics, Host-Pathogen Interactions/physiology, Leptospira/genetics, Animals, Biological Evolution, Genotype, Host Specificity, Humans, Leptospira/pathogenicity, Leptospirosis/microbiology, Madagascar, Molecular Sequence Data, Phylogeny
15.
Article in English | MEDLINE | ID: mdl-26671795

ABSTRACT

In this paper, we present an integer linear programming (ILP) approach, called CoRe-ILP, for finding an optimal time consistent cophylogenetic host-parasite reconciliation under the cophylogenetic event model with the events cospeciation, duplication, sorting, host switch, and failure to diverge. Instead of assuming event costs, a simplified model is used, maximizing primarily for cospeciations and secondarily minimizing host switching events. Duplications, sortings, and failure to diverge events are not explicitly scored. Different from existing event based reconciliation methods, CoRe-ILP can use (approximate) phylogenetic branch lengths for filtering possible ancestral host-parasite interactions. Experimentally, it is shown that CoRe-ILP can successfully use branch length information and performs well for biological and simulated data sets. The results of CoRe-ILP are compared with the results of the reconciliation tools Jane 4, Treemap 3b, NOTUNG 2.8 Beta, and Ranger-DTL. Algorithm CoRe-ILP is implemented using IBM ILOG CPLEX Optimizer 12.6 and is freely available from http://pacosy.informatik.uni-leipzig.de/core-ilp.


Subjects
Algorithms, Molecular Evolution, Gophers/genetics, Host-Parasite Interactions/genetics, Genetic Models, Phthiraptera/genetics, Animals, Computer Simulation, Population Genetics, Gophers/parasitology, Humans, Pedigree, Phylogeny, Linear Programming
16.
Proc Natl Acad Sci U S A ; 112(7): 2058-63, 2015 Feb 17.
Article in English | MEDLINE | ID: mdl-25646426

ABSTRACT

Phylogenomics heavily relies on well-curated sequence data sets that comprise, for each gene, exclusively 1:1 orthologs. Paralogs are treated as a dangerous nuisance that has to be detected and removed. We show here that this severe restriction of the data sets is not necessary. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a degree of certainty. Starting from tree-free estimates of orthology, cograph editing can sufficiently reduce the noise to find correct event-annotated gene trees. The information of gene trees can then directly be translated into constraints on the species trees. Although the resolution is very poor for individual gene families, we show that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees, even in the presence of horizontal gene transfer.


Subjects
Genomics, Phylogeny
17.
PLoS One ; 9(8): e105015, 2014.
Article in English | MEDLINE | ID: mdl-25137074

ABSTRACT

The elucidation of orthology relationships is an important step both in gene function prediction and towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data sets because more exact approaches incur excessive computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance), was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets.


Subjects
Genetic Models, Software, Synteny, Bacterial Proteins/genetics, Cluster Analysis, Computer Simulation, Datasets as Topic, Bacterial Genes
18.
BMC Genomics ; 15: 522, 2014 Jun 25.
Article in English | MEDLINE | ID: mdl-24965762

ABSTRACT

BACKGROUND: The Aquificales are a diverse group of thermophilic bacteria that thrive in terrestrial and marine hydrothermal environments. They can be divided into the families Aquificaceae, Desulfurobacteriaceae and Hydrogenothermaceae. Although eleven fully sequenced and assembled genomes are available, little is known about this taxonomic order in terms of RNA metabolism. RESULTS: In this work, we compare the available genomes, extend their protein annotation, identify regulatory sequences, annotate non-coding RNAs (ncRNAs) of known function, predict novel ncRNA candidates, show idiosyncrasies of the genetic decoding machinery, and present two different types of transfer-messenger RNAs and variations of the CRISPR systems. Furthermore, we performed a phylogenetic analysis of the Aquificales based on entire genome sequences, and extended this by a classification among all bacteria using 16S rRNA sequences and a set of orthologous proteins. Combining several in silico features (e.g. conserved and stable secondary structures, GC-content, comparison based on multiple genome alignments) with an in vivo dRNA-seq transcriptome analysis of Aquifex aeolicus, we predict roughly 100 novel ncRNA candidates in this bacterium. CONCLUSIONS: We have here re-analyzed the Aquificales, a group of bacteria thriving in extreme environments, sharing the feature of a small, compact genome with a reduced number of protein and ncRNA genes. We present several classical ncRNAs and riboswitch candidates. By combining in silico analysis with dRNA-seq data of A. aeolicus we predict nearly 100 novel ncRNA candidates.


Subjects
Bacterial Genome, Gram-Positive Bacteria/genetics, Untranslated RNA/genetics, Base Sequence, Clustered Regularly Interspaced Short Palindromic Repeats/genetics, Genetic Databases, Digoxigenin/chemistry, Gram-Positive Bacteria/classification, Nucleic Acid Conformation, Oligonucleotides/chemistry, Oligonucleotides/metabolism, Phylogeny, Bacterial RNA/chemistry, Bacterial RNA/metabolism, 16S Ribosomal RNA/chemistry, 16S Ribosomal RNA/genetics, Transfer RNA/metabolism, Untranslated RNA/chemistry, Untranslated RNA/metabolism, Ribonuclease P/metabolism, RNA Sequence Analysis
19.
J Math Biol ; 66(1-2): 399-420, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22456957

ABSTRACT

Orthology detection is an important problem in comparative and evolutionary genomics and, consequently, a variety of orthology detection methods have been devised in recent years. Although many of these methods are dependent on generating gene and/or species trees, it has been shown that orthology can be estimated at acceptable levels of accuracy without having to infer gene trees and/or reconciling gene trees with species trees. Thus, it is of interest to understand how much information about the gene tree, the species tree, and their reconciliation is already contained in the orthology relation on the underlying set of genes. Here we shall show that a result by Böcker and Dress concerning symbolic ultrametrics, and subsequent algorithmic results by Semple and Steel for processing these structures can throw a considerable amount of light on this problem. More specifically, building upon these authors' results, we present some new characterizations for symbolic ultrametrics and new algorithms for recovering the associated trees, with an emphasis on how these algorithms could be potentially extended to deal with arbitrary orthology relations. In so doing we shall also show that, somewhat surprisingly, symbolic ultrametrics are very closely related to cographs, graphs that do not contain an induced path on any subset of four vertices. We conclude with a discussion on how our results might be applied in practice to orthology detection.
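
The cograph property mentioned in the abstract can be made concrete with a small check. The sketch below is not taken from the paper. It relies on a standard equivalent characterization: a graph is a cograph (contains no induced path on four vertices) exactly when every induced subgraph with at least two vertices is either disconnected or has a disconnected complement. The function names are hypothetical, and the recursion is written for clarity rather than speed.

```python
def adjacency(vertices, edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def components(vertices, adj):
    """Connected components of the subgraph induced by `vertices`."""
    vertices = set(vertices)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend((adj[v] & vertices) - comp)
        seen |= comp
        comps.append(comp)
    return comps


def is_cograph(vertices, adj):
    """True if the subgraph induced by `vertices` has no induced 4-vertex path."""
    vertices = set(vertices)
    if len(vertices) <= 1:
        return True
    comps = components(vertices, adj)
    if len(comps) == 1:
        # Connected: the complement must split, otherwise an induced P4 exists.
        co_adj = {v: vertices - adj[v] - {v} for v in vertices}
        comps = components(vertices, co_adj)
        if len(comps) == 1:
            return False
    return all(is_cograph(c, adj) for c in comps)


path4 = adjacency("abcd", [("a", "b"), ("b", "c"), ("c", "d")])
cycle4 = adjacency("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
print(is_cograph("abcd", path4), is_cograph("abcd", cycle4))
```

The four-vertex path is the forbidden subgraph itself, so it fails the check, whereas the four-cycle passes because its complement splits into two disjoint edges.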


Subjects
Genetic Models, Phylogeny, Algorithms, Animals, Computational Biology, Molecular Evolution, Genomics, Humans, Mathematical Concepts
20.
Virology ; 423(1): 68-76, 2012 Feb 05.
Article in English | MEDLINE | ID: mdl-22189211

ABSTRACT

We determined the complete genome sequences of Tribec virus (TRBV) and Kemerovo virus (KEMV), two tick-transmitted orbiviruses that can cause diseases of the central nervous system and that are currently classified into the Great Island virus serogroup. VP2 proteins of TRBV and KEMV show very low sequence similarity to the homologous VP4 protein of tick-transmitted Great Island virus (GIV). The new sequence data support the previous serological classification of these orbiviruses into the Kemerovo serogroup, which is different from the Great Island virus serogroup. Genome segment 9 of TRBV and KEMV encodes several overlapping ORFs in the +1 reading frame relative to VP6(Hel). A co-phylogenetic analysis indicates a host switch from insect-borne orbiviruses toward Ixodes species, which is in disagreement with previously published data.


Subjects
Arachnid Vectors/virology, Orbivirus/genetics, Orbivirus/isolation & purification, Reoviridae Infections/virology, Ticks/virology, Amino Acid Sequence, Animals, Base Sequence, Cell Line, Molecular Evolution, Viral Genome, Humans, Molecular Sequence Data, Orbivirus/chemistry, Orbivirus/classification, Phylogeny, Sequence Alignment, Viral Proteins/chemistry, Viral Proteins/genetics
...