ABSTRACT
The Andes mountains of western South America are a globally important biodiversity hotspot, yet there is a paucity of resolved phylogenies for plant clades from this region. Filling an important gap in our understanding of the world's richest flora, we present the first phylogeny of Freziera (Pentaphylacaceae), an Andean-centered, cloud forest radiation. Our dataset was obtained via hybrid-enriched target sequence capture of Angiosperms353 universal loci for 50 of the ca. 75 spp., sampled almost entirely from herbarium specimens. We identify high phylogenomic complexity in Freziera, including the presence of data artifacts. Through visual inspection of gene trees, detailed examination of warnings from recently improved assembly pipelines, and gene tree filtering, we identified that artifactual orthologs (i.e., the presence of only one copy of a multicopy gene due to differential assembly) were an important source of gene tree heterogeneity that had a negative impact on phylogenetic inference and support. These artifactual orthologs may be common in plant phylogenomic datasets, where multiple instances of genome duplication are common. After accounting for artifactual orthologs as a source of gene tree error, we identified a significant, but nonspecific signal of introgression using Patterson's D and f4 statistics. Despite phylogenomic complexity, we were able to resolve Freziera into 9 well-supported subclades whose evolution has been shaped by multiple evolutionary processes, including incomplete lineage sorting, historical gene flow, and gene duplication. Our results highlight the complexities of plant phylogenomics, which are heightened in Andean radiations, and show the impact of filtering data processing artifacts and standard filtering approaches on phylogenetic inference.
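As an illustration of the Patterson's D statistic mentioned above, the following is a minimal sketch that counts ABBA and BABA site patterns in four aligned sequences, assuming the topology (((P1,P2),P3),O) and one sequence per taxon. The function name `patterson_d` and the toy sequences are hypothetical and are not taken from the study, which does not describe its implementation here.

```python
import math


def patterson_d(p1, p2, p3, out):
    """Patterson's D for four aligned sequences under (((P1,P2),P3),O).

    The outgroup base is treated as the ancestral state (A); any other
    base at that site is the derived state (B).
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, out):
        # Skip gaps and ambiguity codes (illustrative choice).
        if not {a, b, c, o} <= set("ACGT"):
            continue
        if a == o and b == c and a != b:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != b:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    if total == 0:
        return math.nan  # no informative sites
    return (abba - baba) / total


# Toy alignment: three ABBA sites, one BABA site, one invariant site.
d = patterson_d("AAACA", "GGTAA", "GGTCA", "AAAAA")
```

A value near zero is expected under incomplete lineage sorting alone, whereas a consistent excess of one pattern is compatible with gene flow between P3 and either P1 or P2. Assessing significance requires a resampling step (for example a block jackknife), which this sketch omits.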
Subject(s)
Phylogeny , Classification/methods , South America , Genome, Plant
ABSTRACT
The economically important cotton and cacao family (Malvaceae sensu lato) has long been recognized as a monophyletic group. However, the relationships among some subfamilies are still unclear, as discordant phylogenetic hypotheses keep arising when different sources of molecular data are analyzed. Phylogenetic discordance has previously been hypothesized to be the result of both introgression and incomplete lineage sorting (ILS), but the extent and source of discordance have not yet been evaluated in the context of loci derived from massive sequencing strategies and for a wide representation of the family. Furthermore, no formal methods have been applied to evaluate whether the detected phylogenetic discordance among phylogenomic datasets influences phylogenetic dating estimates of the concordant relationships. The objective of this research was to generate a phylogenetic hypothesis of Malvaceae from nuclear genes. Specifically, we aimed to (1) investigate the presence of major discordance among hundreds of nuclear gene histories of Malvaceae; (2) evaluate the potential source of discordance; and (3) examine whether discordance and loci heterogeneity influence time estimates of the origin and diversification of subfamilies. Our study is based on a comprehensive dataset representing 96 genera of the nine subfamilies and 268 nuclear loci. Both concatenated and coalescent-based approaches were followed for phylogenetic inference. Using branch lengths and topology, we located the placement of introgression events to directly evaluate whether discordance is due to introgression rather than ILS. To estimate divergence times, concordance and molecular rate were considered. We filtered loci based on congruence with the species tree and then obtained the molecular rate of each locus to distribute them into three different sets corresponding to shared molecular rate ranges. Bayesian dating was performed for each of the different sets of loci with the same parameters and calibrations. Phylogenomic discordance was detected between methods, as well as among gene histories. At deep coalescent times, we found discordance in the position of five subclades, probably due to ILS and a relatively small proportion of introgression. Divergence time estimation with each set of loci generated overlapping clade ages, indicating that, even with different molecular rates and gene histories, calibrations generally provide a strong prior.
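The step of distributing loci into three sets by molecular rate could be sketched as follows. The abstract does not say how the rate ranges were defined, so this example assumes three equal-width ranges between the slowest and fastest locus. The function name `bin_loci_by_rate`, the locus names, and the rate values are all hypothetical.

```python
def bin_loci_by_rate(rates, n_bins=3):
    """Split loci into n_bins sets using equal-width rate ranges.

    rates maps a locus name to its estimated molecular rate.
    Equal-width ranges are an assumption made for illustration only.
    """
    lo, hi = min(rates.values()), max(rates.values())
    # Fall back to a width of 1.0 when every locus has the same rate.
    width = (hi - lo) / n_bins or 1.0
    bins = [[] for _ in range(n_bins)]
    for locus, rate in sorted(rates.items()):
        # Clamp so that the fastest locus falls into the last bin.
        idx = min(int((rate - lo) / width), n_bins - 1)
        bins[idx].append(locus)
    return bins


# Made-up rates (substitutions per site per unit time).
example_rates = {
    "locusA": 0.001,
    "locusB": 0.002,
    "locusC": 0.005,
    "locusD": 0.009,
    "locusE": 0.010,
}
slow, medium, fast = bin_loci_by_rate(example_rates)
```

Each resulting set would then be dated separately with identical parameters and calibrations, as the abstract describes. Quantile-based bins would be an equally plausible reading of "shared molecular rate ranges".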
ABSTRACT
Austrodiplostomum spp. (Platyhelminthes: Digenea) are endoparasites with a broad geographic distribution in South America. During the larval stage, they parasitize the eyes, brains, muscles, gills, kidneys and swim bladder of a wide variety of fishes. The metacercariae of Austrodiplostomum spp. show several morphological characteristics during development, but these are very similar among species, which makes molecular tools necessary to help identify species during the larval stage. The objective of this study was to perform morphological and molecular analyses of Austrodiplostomum sp. found in specimens of Hypostomus sourced from the Ivaí River in the state of Paraná, Brazil. Of the 93 analyzed specimens (H. hermanni [n = 50], H. albopunctatus [n = 9], Hypostomus sp. 1 [n = 24], and Hypostomus sp. 2 [n = 10]), 60 were parasitized. A total of 577 Austrodiplostomum sp. metacercariae were collected from the infected hosts; DNA from seven of these samples was extracted, amplified, and sequenced. The morphological data, together with the genetic distance values and the relationships observed in the COI gene tree, indicate that all metacercariae were A. compactum. This is the first record of A. compactum parasitizing H. hermanni, H. albopunctatus, Hypostomus sp. 1, and Hypostomus sp. 2 in the Ivaí River.
Subject(s)
Animals , Trematoda/anatomy & histology , Trematoda/genetics , Catfishes , Fish Diseases/parasitology , Brain/parasitology , Brazil , Rivers , Metacercariae/genetics
ABSTRACT
Animals depend on the sequential oxidation of organic molecules to survive; thus, oxygen-carrying/transporting proteins play a fundamental role in aerobic metabolism. Globins are the most common and widespread group of respiratory proteins. They can be divided into three types: circulating intracellular, noncirculating intracellular, and extracellular, all of which have been reported in annelids. The diversity of oxygen transport proteins has been underestimated across metazoans. We probed 250 annelid transcriptomes in search of globin diversity in order to elucidate the evolutionary history of this gene family within this phylum. We report two new globin types in annelids, namely androglobins and cytoglobins. Although cytoglobins and myoglobins from vertebrates and from invertebrates are referred to by the same name, our data show they are not genuine orthologs. Our phylogenetic analyses show that extracellular globins from annelids are more closely related to extracellular globins from other metazoans than to the intracellular globins of annelids. Broadly, our findings indicate that multiple gene duplication and neo-functionalization events shaped the evolutionary history of the globin family.
Subject(s)
Annelida/genetics , Evolution, Molecular , Globins/genetics , Multigene Family , Amino Acid Sequence , Animals , Annelida/chemistry , Gene Duplication , Globins/chemistry , Phylogeny
ABSTRACT
Multicellular organisms depend on oxygen-carrying proteins to transport oxygen throughout the body; therefore, proteins such as hemoglobins (Hbs), hemocyanins, and hemerythrins are essential for maintenance of tissues and cellular respiration. Vertebrate Hbs are among the most extensively studied proteins; however, much less is known about invertebrate Hbs. Recent studies of hemocyanins and hemerythrins have demonstrated that they have much wider distributions than previously thought, suggesting that oxygen-binding protein diversity is underestimated across metazoans. Hexagonal bilayer hemoglobin (HBL-Hb), a blood pigment found exclusively in annelids, is a polymer composed of up to 144 extracellular globins and 36 linker chains. To further understand the evolutionary history of this protein complex, we explored the diversity of linkers and extracellular globins from HBL-Hbs using in silico approaches on 319 metazoan transcriptomes and one choanoflagellate transcriptome. We found 559 extracellular globin and 414 linker genes transcribed in 171 species from ten animal phyla, with new records in Echinodermata, Hemichordata, Brachiopoda, Mollusca, Nemertea, Bryozoa, Phoronida, Platyhelminthes, and Priapulida. Contrary to previous suggestions that linkers and extracellular globins emerged in the annelid ancestor, our findings indicate that they putatively emerged before the protostome-deuterostome split. For the first time, we unveiled the comprehensive evolutionary history of metazoan HBL-Hb components, which consists of multiple episodes of gene gains and losses. Moreover, because our study design surveyed linkers and extracellular globins independently, we were able to cross-validate our results, significantly reducing the rate of false positives. We confirmed that the distribution of HBL-Hb components has until now been underestimated among animals.
Subject(s)
Globins/genetics , Invertebrates/genetics , Phylogeny , Animals
ABSTRACT
Phylogenomics aims at reconstructing the evolutionary histories of organisms taking into account whole genomes or large fractions of genomes. The abundance of genomic data for an enormous variety of organisms has enabled phylogenomic inference of many groups, and this has motivated the development of many computer programs implementing the associated methods. This chapter surveys phylogenetic concepts and methods aimed at both gene tree and species tree reconstruction while also addressing common pitfalls, providing references to relevant computer programs. A practical phylogenomic analysis example including bacterial genomes is presented at the end of the chapter.
Subject(s)
Evolution, Molecular , Genome, Bacterial , Genomics/methods , Phylogeny , Software , INDEL Mutation , Models, Genetic , Polymorphism, Single Nucleotide
ABSTRACT
Gene turnover rates and the evolution of gene family sizes are important aspects of genome evolution. Here, we use curated sequence data of the major chemosensory gene families from Drosophila (the gustatory receptor, odorant receptor, ionotropic receptor, and odorant-binding protein families) to conduct a comparative analysis among families, exploring different methods to estimate gene birth and death rates, including an ad hoc simulation study. Remarkably, we found that the state-of-the-art methods may produce very different rate estimates, which may lead to disparate conclusions regarding the evolution of chemosensory gene family sizes in Drosophila. Among biological factors, we found that a peculiarity of D. sechellia's gene turnover rates was a major source of bias in global estimates, whereas gene conversion had negligible effects for the families analyzed herein. Turnover rates vary considerably among families, subfamilies, and ortholog groups, although all analyzed families were quite dynamic in terms of gene turnover. Computer simulations showed that the methods that use ortholog group information appear to be the most accurate for the Drosophila chemosensory families. Most importantly, these results reveal the potential of rate heterogeneity among lineages to severely bias some turnover rate estimation methods and the need to further evaluate the performance of these methods in a more diverse sampling of gene families and phylogenetic contexts. Using branch-specific codon substitution models, we find further evidence of positive selection in recently duplicated genes, which attests to a nonneutral aspect of the gene birth-and-death process.
Subject(s)
Drosophila/genetics , Evolution, Molecular , Genetic Techniques/standards , Receptors, Odorant/genetics , Animals , Computer Simulation , Gene Duplication , Humans
ABSTRACT
Phylogenetic analysis based on multi-loci data sets is performed by means of supermatrix (SM) or supertree (ST) approaches. Recently, methods that rely on species tree (SppT) inference under the multispecies coalescent have also been implemented to tackle this problem. Generally, the relative performance of these three major strategies has been assessed using simulation of biological sequences. However, sequence simulation may not entirely replicate the complexity of the evolutionary process. Thus, issues regarding the usefulness of in silico sequences in studying the performance of phylogenetic methods have been raised. Here, we used both classical simulation and empirical data to investigate the relative performance of ST, SM, and SppT methods. SM analyses performed better than the ST and SppT methods in simulations, but not in empirical analyses, where some ST methods significantly outperformed the others. Additionally, SM was the only method that was robust under evolutionary model violations in simulations. These results show that conventional biological sequence simulation cannot adequately resolve which method is most efficient at recovering the SppT. In such simulations, the SM approach recovers the established phylogeny in most instances, whereas the performance of the ST and SppT methods is downgraded in simpler cases. When compared, the analyses based on empirical and simulated sequences yielded largely inconsistent results, with the latter showing a bias towards a seeming superiority of SM approaches.