1.
J Evol Biol ; 36(10): 1525-1538, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37776088

ABSTRACT

Populations suffer two types of stochasticity: demographic stochasticity, from sampling error in offspring number, and environmental stochasticity, from temporal variation in the growth rate. By modelling evolution through phenotypic selection following an abrupt environmental change, we investigate how genetic and demographic dynamics, as well as effects on population survival of the genetic variance and of the strength of stabilizing selection, differ under the two types of stochasticity. We show that population survival probability declines sharply with stronger stabilizing selection under demographic stochasticity, but declines more continuously when environmental stochasticity is strengthened. However, the genetic variance that confers the highest population survival probability differs little under demographic and environmental stochasticity. Since the influence of demographic stochasticity is stronger when population size is smaller, a slow initial decline of genetic variance, which allows quicker evolution, is important for population persistence. In contrast, the influence of environmental stochasticity is population-size-independent, so higher initial fitness becomes important for survival under strong environmental stochasticity. The two types of stochasticity interact in a more than multiplicative way in reducing the population survival probability. Our work suggests the importance of explicitly distinguishing and measuring the forms of stochasticity during evolutionary rescue.

2.
Syst Biol ; 71(6): 1290-1306, 2022 Oct 12.
Article in English | MEDLINE | ID: mdl-35285502

ABSTRACT

Morphology remains a primary source of phylogenetic information for many groups of organisms, and the only one for most fossil taxa. Organismal anatomy is not a collection of randomly assembled and independent "parts", but instead a set of dependent and hierarchically nested entities resulting from ontogeny and phylogeny. How do we make sense of these dependent and at times redundant characters? One promising approach is using ontologies, structured controlled vocabularies that summarize knowledge about different properties of anatomical entities, including developmental and structural dependencies. Here, we assess whether evolutionary patterns can explain the proximity of ontology-annotated characters within an ontology. To do so, we measure phylogenetic information across characters and evaluate if it matches the hierarchical structure given by ontological knowledge, in much the same way as across-species diversity structure is given by phylogeny. We implement an approach to evaluate the Bayesian phylogenetic information (BPI) content and phylogenetic dissonance among ontology-annotated anatomical data subsets. We applied this to data sets representing two disparate animal groups: bees (Hexapoda: Hymenoptera: Apoidea, 209 chars) and characiform fishes (Actinopterygii: Ostariophysi: Characiformes, 463 chars). For bees, we find that BPI is not substantially explained by anatomy since dissonance is often high among morphologically related anatomical entities. For fishes, we find substantial information for two clusters of anatomical entities instantiating concepts from the jaws and branchial arch bones, but among-subset information decreases and dissonance increases substantially moving to higher-level subsets in the ontology. We further applied our approach to address particular evolutionary hypotheses with an example of morphological evolution in miniature fishes.
While we show that phylogenetic information does match ontology structure for some anatomical entities, additional relationships and processes, such as convergence, likely play a substantial role in explaining BPI and dissonance, and merit future investigation. Our work demonstrates how complex morphological data sets can be interrogated with ontologies by allowing one to access how information is spread hierarchically across anatomical concepts, how congruent this information is, and what sorts of processes may play a role in explaining it: phylogeny, development, or convergence. [Apidae; Bayesian phylogenetic information; Ostariophysi; Phenoscape; phylogenetic dissonance; semantic similarity.].


Subjects
Arthropods , Characiformes , Animals , Bayes Theorem , Fossils , Phylogeny
3.
Res Policy ; 50(1): 104069, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33390628

ABSTRACT

Synthesis centers are a form of scientific organization that catalyzes and supports research that integrates diverse theories, methods and data across spatial or temporal scales to increase the generality, parsimony, applicability, or empirical soundness of scientific explanations. Synthesis working groups are a distinctive form of scientific collaboration that produce consequential, high-impact publications. But no one has asked if synthesis working groups synthesize: are their publications substantially more diverse than others, and if so, in what ways and with what effect? We investigate these questions by using Latent Dirichlet Allocation to compare the topical diversity of papers published by synthesis center collaborations with that of papers in a reference corpus. Topical diversity was operationalized and measured in several ways, both to reflect aggregate diversity and to emphasize particular aspects of diversity (such as variety, evenness, and balance). Synthesis center publications have greater topical variety and evenness, but less disparity, than do papers in the reference corpus. The influence of synthesis center origins on aspects of diversity is only partly mediated by the size and heterogeneity of collaborations: when taking into account the numbers of authors, distinct institutions, and references, synthesis center origins retain a significant direct effect on diversity measures. Controlling for the size and heterogeneity of collaborative groups, synthesis center origins and diversity measures significantly influence the visibility of publications, as indicated by citation measures. We conclude by suggesting social processes within collaborations that might account for the observed effects, by inviting further exploration of what this novel textual analysis approach might reveal about interdisciplinary research, and by offering some practical implications of our results.

4.
Evolution ; 74(8): 1590-1602, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32267552

ABSTRACT

The role of genetic architecture in adaptation to novel environments has received considerable attention when the source of adaptive variation is de novo mutation. Relatively less is known when the source of adaptive variation is inter- or intraspecific hybridization. We model hybridization between divergent source populations and subsequent colonization of an unoccupied novel environment using individual-based simulations to understand the influence of genetic architecture on the timing of colonization and the mode of adaptation. We find that two distinct categories of genetic architecture facilitate rapid colonization but that they do so in qualitatively different ways. For few and/or tightly linked loci, the mode of adaptation is via the recovery of adaptive parental genotypes. With many unlinked loci, the mode of adaptation is via the generation of novel hybrid genotypes. The first category results in the shortest colonization lag phases across the widest range of parameter space, but further adaptation is mutation limited. The second category takes longer and is more sensitive to genetic variance and dispersal rate, but can facilitate adaptation to environmental conditions that exceed the tolerance of parental populations. These findings have implications for understanding the origins of biological invasions and the success of hybrid populations.


Subjects
Genetic Hybridization , Introduced Species , Genetic Models , Genetic Epistasis , Genetic Linkage
5.
Syst Biol ; 69(2): 345-362, 2020 Mar 01.
Article in English | MEDLINE | ID: mdl-31596473

ABSTRACT

There is a growing body of research on the evolution of anatomy in a wide variety of organisms. Discoveries in this field could be greatly accelerated by computational methods and resources that enable these findings to be compared across different studies and different organisms and linked with the genes responsible for anatomical modifications. Homology is a key concept in comparative anatomy; two important types are historical homology (the similarity of organisms due to common ancestry) and serial homology (the similarity of repeated structures within an organism). We explored how to most effectively represent historical and serial homology across anatomical structures to facilitate computational reasoning. We assembled a collection of homology assertions from the literature with a set of taxon phenotypes for the skeletal elements of vertebrate fins and limbs from the Phenoscape Knowledgebase. Using seven competency questions, we evaluated the reasoning ramifications of two logical models: the Reciprocal Existential Axioms (REA) homology model and the Ancestral Value Axioms (AVA) homology model. The AVA model returned all user-expected results in addition to the search term and any of its subclasses. The AVA model also returned any superclass of the query term in which a homology relationship has been asserted. The REA model returned the user-expected results for five out of seven queries. We identify some challenges of implementing complete homology queries due to limitations of OWL reasoning. This work lays the foundation for homology reasoning to be incorporated into other ontology-based tools, such as those that enable synthetic supermatrix construction and candidate gene discovery. [Homology; ontology; anatomy; morphology; evolution; knowledgebase; phenoscape.].


Subjects
Classification/methods , Biological Models , Animal Fins/anatomy & histology , Animals , Extremities/anatomy & histology , Vertebrates/anatomy & histology
6.
PeerJ Comput Sci ; 5: e234, 2019.
Article in English | MEDLINE | ID: mdl-33816887

ABSTRACT

Conferences with contributed talks grouped into multiple concurrent sessions pose an interesting scheduling problem. From an attendee's perspective, choosing which talks to visit when there are many concurrent sessions is challenging since an individual may be interested in topics that are discussed in different sessions simultaneously. The frequency of topically similar talks in different concurrent sessions is, in fact, a common cause for complaint in post-conference surveys. Here, we introduce a practical solution to the conference scheduling problem by heuristic optimization of an objective function that weighs the occurrence of both topically similar talks in one session and topically different talks in concurrent sessions. Rather than clustering talks based on a limited number of preconceived topics, we employ a topic model to allow the topics to naturally emerge from the corpus of contributed talk titles and abstracts. We then measure the topical distance between all pairs of talks. Heuristic optimization of preliminary schedules seeks to balance the topical similarity of talks within a session and the dissimilarity between concurrent sessions. Using an ecology conference as a test case, we find that stochastic optimization dramatically improves the objective function relative to the schedule manually produced by the program committee. Approximate Integer Linear Programming can be used to provide a partially-optimized starting schedule, but the final value of the discrimination ratio (an objective function used to estimate coherence within a session and disparity between concurrent sessions) is surprisingly insensitive to the starting schedule. Furthermore, we show that, in contrast to the manual process, arbitrary scheduling constraints are straightforward to include. We applied our method to a second biology conference with over 1,000 contributed talks plus scheduling constraints. 
In a randomized experiment, biologists responded similarly to a machine-optimized schedule and a highly modified schedule produced by domain experts on the conference program committee.
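
The optimization described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formula for the discrimination ratio, so the version below (mean topical distance between talks in concurrent sessions divided by mean distance between talks in the same session) is an assumption, as are the function names, the nested-list schedule layout, and the simple swap-based hill climbing. It also assumes every session holds at least two talks and that no two talks in a session have zero distance.

```python
import itertools
import random


def mean_distance(pairs, dist):
    """Average topical distance over a list of (talk, talk) index pairs."""
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def discrimination_ratio(schedule, dist):
    """Assumed objective: between-session distance over within-session distance.

    `schedule` is a list of time slots; each slot is a list of concurrent
    sessions; each session is a list of talk indices into `dist`.
    """
    within, between = [], []
    for slot in schedule:
        for session in slot:
            within.extend(itertools.combinations(session, 2))
        for s1, s2 in itertools.combinations(slot, 2):
            between.extend(itertools.product(s1, s2))
    return mean_distance(between, dist) / mean_distance(within, dist)


def optimize(schedule, dist, steps=2000, seed=0):
    """Stochastic hill climbing: swap two talks and keep the swap if it helps."""
    rng = random.Random(seed)
    # Work on a copy so the caller's starting schedule is left untouched.
    schedule = [[list(session) for session in slot] for slot in schedule]
    sessions = [session for slot in schedule for session in slot]
    best = discrimination_ratio(schedule, dist)
    for _ in range(steps):
        s1, s2 = rng.sample(sessions, 2)
        i, j = rng.randrange(len(s1)), rng.randrange(len(s2))
        s1[i], s2[j] = s2[j], s1[i]
        score = discrimination_ratio(schedule, dist)
        if score > best:
            best = score
        else:
            s1[i], s2[j] = s2[j], s1[i]  # undo a swap that did not improve
    return schedule, best
```

Calling `optimize(initial_schedule, dist)` with a square matrix of pairwise talk distances returns a rearranged schedule and its score. Because only improving swaps are kept, the returned score is never lower than that of the starting schedule. Scheduling constraints could be added by rejecting swaps that violate them.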

7.
Pac Symp Biocomput ; 21: 132-43, 2016.
Article in English | MEDLINE | ID: mdl-26776180

ABSTRACT

There is growing use of ontologies for the measurement of cross-species phenotype similarity. Such similarity measurements contribute to diverse applications, such as identifying genetic models for human diseases, transferring knowledge among model organisms, and studying the genetic basis of evolutionary innovations. Two organismal features, whether genes, anatomical parts, or any other inherited feature, are considered to be homologous when they are evolutionarily derived from a single feature in a common ancestor. A classic example is the homology between the paired fins of fishes and vertebrate limbs. Anatomical ontologies that model the structural relations among parts may fail to include some known anatomical homologies unless they are deliberately added as separate axioms. The consequences of neglecting known homologies for applications that rely on such ontologies have not been well studied. Here, we examine how semantic similarity is affected when external homology knowledge is included. We measure phenotypic similarity between orthologous and non-orthologous gene pairs between humans and either mouse or zebrafish, and compare the inclusion of real with faux homology axioms. Semantic similarity was preferentially increased for orthologs when using real homology axioms, but only in the more divergent of the two species comparisons (human to zebrafish, not human to mouse), and the increase relative to non-orthologs was less than 1%. By contrast, inclusion of both real and faux random homology axioms preferentially increased similarities between genes that were initially more dissimilar in the other comparisons. Biologically meaningful increases in semantic similarity were seen for a select subset of gene pairs. Overall, the effect of including homology axioms on cross-species semantic similarity was modest at the levels of divergence examined here, but our results hint that it may be greater for more distant species comparisons.


Subjects
Comparative Anatomy/methods , Comparative Anatomy/statistics & numerical data , Animals , Computational Biology/methods , Computational Biology/statistics & numerical data , Molecular Evolution , Humans , Mice , Phenotype , Semantics , Nucleic Acid Sequence Homology , Species Specificity , Systems Integration , Zebrafish/anatomy & histology , Zebrafish/genetics
8.
Mol Biol Evol ; 33(1): 13-24, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26500251

ABSTRACT

Phenotypes resulting from mutations in genetic model organisms can help reveal candidate genes for evolutionarily important phenotypic changes in related taxa. Although testing candidate gene hypotheses experimentally in nonmodel organisms is typically difficult, ontology-driven information systems can help generate testable hypotheses about developmental processes in experimentally tractable organisms. Here, we tested candidate gene hypotheses suggested by expert use of the Phenoscape Knowledgebase, specifically looking for genes that are candidates responsible for evolutionarily interesting phenotypes in the ostariophysan fishes that bear resemblance to mutant phenotypes in zebrafish. For this, we searched ZFIN for genetic perturbations that result in either loss of basihyal element or loss of scales phenotypes, because these are the ancestral phenotypes observed in catfishes (Siluriformes). We tested the identified candidate genes by examining their endogenous expression patterns in the channel catfish, Ictalurus punctatus. The experimental results were consistent with the hypotheses that these features evolved through disruption in developmental pathways at, or upstream of, brpf1 and eda/edar for the ancestral losses of basihyal element and scales, respectively. These results demonstrate that ontological annotations of the phenotypic effects of genetic alterations in model organisms, when aggregated within a knowledgebase, can be used effectively to generate testable, and useful, hypotheses about evolutionary changes in morphology.


Subjects
Catfishes/genetics , Molecular Evolution , Gene Expression , Genetic Models , Phenotype , Animals , Computational Biology , Gene Expression/genetics , Gene Expression/physiology , Software
9.
Genesis ; 53(8): 561-71, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26220875

ABSTRACT

The abundance of phenotypic diversity among species can enrich our knowledge of development and genetics beyond the limits of variation that can be observed in model organisms. The Phenoscape Knowledgebase (KB) is designed to enable exploration and discovery of phenotypic variation among species. Because phenotypes in the KB are annotated using standard ontologies, evolutionary phenotypes can be compared with phenotypes from genetic perturbations in model organisms. To illustrate the power of this approach, we review the use of the KB to find taxa showing evolutionary variation similar to that of a query gene. Matches are made between the full set of phenotypes described for a gene and an evolutionary profile, the latter of which is defined as the set of phenotypes that are variable among the daughters of any node on the taxonomic tree. Phenoscape's semantic similarity interface allows the user to assess the statistical significance of each match and flags matches that may only result from differences in annotation coverage between genetic and evolutionary studies. Tools such as this will help meet the challenge of relating the growing volume of genetic knowledge in model organisms to the diversity of phenotypes in nature. The Phenoscape KB is available at http://kb.phenoscape.org.


Subjects
Genetic Databases , Genetic Association Studies/methods , Animals , Biological Evolution , Computational Biology/methods , Humans , Knowledge Bases , Phenotype
10.
J Biomed Semantics ; 5(1): 45, 2014.
Article in English | MEDLINE | ID: mdl-25411634

ABSTRACT

BACKGROUND: Phenex (http://phenex.phenoscape.org/) is a desktop application for semantically annotating the phenotypic character matrix datasets common in evolutionary biology. Since its initial publication, we have added new features that address several major bottlenecks in the efficiency of the phenotype curation process: allowing curators during the data curation phase to provisionally request terms that are not yet available from a relevant ontology; supporting quality control against annotation guidelines to reduce later manual review and revision; and enabling the sharing of files for collaboration among curators. RESULTS: We decoupled data annotation from ontology development by creating an Ontology Request Broker (ORB) within Phenex. Curators can use the ORB to request a provisional term for use in data annotation; the provisional term can be automatically replaced with a permanent identifier once the term is added to an ontology. We added a set of annotation consistency checks to prevent common curation errors, reducing the need for later correction. We facilitated collaborative editing by improving the reliability of Phenex when used with online folder sharing services, via file change monitoring and continual autosave. CONCLUSIONS: With the addition of these new features, and in particular the Ontology Request Broker, Phenex users have been able to focus more effectively on data annotation. Phenoscape curators using Phenex have reported a smoother annotation workflow, with much reduced interruptions from ontology maintenance and file management issues.
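
The provisional-term workflow described above can be sketched as a simple identifier substitution. This is an illustration only, not Phenex code: the function name, the dictionary-based annotation layout, and the `PROVISIONAL:` and `EXAMPLE:` identifier prefixes are all hypothetical.

```python
def resolve_provisional_terms(annotations, resolved):
    """Return copies of the annotations with provisional IDs swapped out.

    `annotations` is a list of dicts mapping a slot name to a term ID.
    `resolved` maps provisional IDs to the permanent IDs assigned once the
    ontology has accepted the term; IDs not in `resolved` are left unchanged.
    """
    return [
        {slot: resolved.get(term_id, term_id) for slot, term_id in annotation.items()}
        for annotation in annotations
    ]
```

Because the function builds new dictionaries, the original annotations still carry their provisional identifiers and can be kept as a record of what was requested.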

11.
Genome Biol Evol ; 6(1): 53-64, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24336482

ABSTRACT

Major unresolved questions in evolutionary genetics include determining the contributions of different mutational sources to the total pool of genetic variation in a species, and understanding how these different forms of genetic variation interact with natural selection. Recent work has shown that structural variants (SVs) (insertions, deletions, inversions, and transpositions) are a major source of genetic variation, often outnumbering single nucleotide variants in terms of total bases affected. Despite the near ubiquity of SVs, major questions about their interaction with natural selection remain. For example, how does the allele frequency spectrum of SVs differ when compared with single nucleotide variants? How often do SVs affect genes, and what are the consequences? To begin to address these questions, we have systematically identified and characterized a large set of submicroscopic insertion and deletion (indel) variants (between 1 and 200 kb in length) among ten inbred lines from a single natural population of the plant species Mimulus guttatus. After extensive computational filtering, we focused on a set of 4,142 high-confidence indels that showed an experimental validation rate of 73%. All but one of these indels were less than 200 kb. Although the largest were generally at lower frequencies in the population, a surprising number of large indels are at intermediate frequencies. Although indels overlapping with genes were much rarer than expected by chance, approximately 600 genes were affected by an indel. Nucleotide-binding site leucine-rich repeat (NBS-LRR) defense response genes were the most enriched among the gene families affected. Most indels associated with genes were rare and appeared to be under purifying selection, though we do find four high-frequency derived insertion alleles that show signatures of recent positive selection.


Subjects
Plant Genome , Genomic Structural Variation , Mimulus/genetics , Gene Frequency , INDEL Mutation , Leucine-Rich Repeat Proteins , Plant Proteins/genetics , Single Nucleotide Polymorphism , Proteins/genetics , Genetic Selection
13.
J Biomed Semantics ; 4(1): 34, 2013 Nov 22.
Article in English | MEDLINE | ID: mdl-24267744

ABSTRACT

BACKGROUND: A hierarchical taxonomy of organisms is a prerequisite for semantic integration of biodiversity data. Ideally, there would be a single, expansive, authoritative taxonomy that includes extinct and extant taxa, information on synonyms and common names, and monophyletic supraspecific taxa that reflect our current understanding of phylogenetic relationships. DESCRIPTION: As a step towards development of such a resource, and to enable large-scale integration of phenotypic data across vertebrates, we created the Vertebrate Taxonomy Ontology (VTO), a semantically defined taxonomic resource derived from the integration of existing taxonomic compilations, and freely distributed under a Creative Commons Zero (CC0) public domain waiver. The VTO includes both extant and extinct vertebrates and currently contains 106,947 taxonomic terms, 22 taxonomic ranks, 104,736 synonyms, and 162,400 cross-references to other taxonomic resources. Key challenges in constructing the VTO included (1) extracting and merging names, synonyms, and identifiers from heterogeneous sources; (2) structuring hierarchies of terms based on evolutionary relationships and the principle of monophyly; and (3) automating this process as much as possible to accommodate updates in source taxonomies. CONCLUSIONS: The VTO is the primary source of taxonomic information used by the Phenoscape Knowledgebase (http://phenoscape.org/), which integrates genetic and evolutionary phenotype data across both model and non-model vertebrates. The VTO is useful for inferring phenotypic changes on the vertebrate tree of life, which enables queries for candidate genes for various episodes in vertebrate evolution.

14.
PeerJ ; 1: e175, 2013.
Article in English | MEDLINE | ID: mdl-24109559

ABSTRACT

Background. Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the "citation benefit". Furthermore, little is known about patterns in data reuse over time and across datasets. Method and Results. Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. 
The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties. Conclusion. After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.

15.
J Appl Ichthyol ; 28(3): 300-305, 2012 Jun 01.
Article in English | MEDLINE | ID: mdl-22736877

ABSTRACT

The rich phenotypic diversity that characterizes the vertebrate skeleton results from evolutionary changes in regulation of genes that drive development. Although relatively little is known about the genes that underlie the skeletal variation among fish species, significant knowledge of genetics and development is available for zebrafish. Because developmental processes are highly conserved, this knowledge can be leveraged for understanding the evolution of skeletal diversity. We developed the Phenoscape Knowledgebase (KB; http://kb.phenoscape.org) to yield testable hypotheses of candidate genes involved in skeletal evolution. We developed a community anatomy ontology for fishes and ontology-based methods to represent complex free-text character descriptions of species in a computable format. With these tools, we populated the KB with comparative morphological data from the literature on over 2,500 teleost fishes (mainly Ostariophysi) resulting in over 500,000 taxon phenotype annotations. The KB integrates these data with similarly structured phenotype data from zebrafish genes (http://zfin.org). Using ontology-based reasoning, candidate genes can be inferred for the phenotypes that vary across taxa, thereby uniting genetic and phenotypic data to formulate evo-devo hypotheses. The morphological data in the KB can be browsed, sorted, and aggregated in ways that provide unprecedented possibilities for data mining and discovery.

17.
Syst Biol ; 60(2): 117-25, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21186249

ABSTRACT

Phylogenetic analyses using genome-scale data sets must confront incongruence among gene trees, which in plants is exacerbated by frequent gene duplications and losses. Gene tree parsimony (GTP) is a phylogenetic optimization criterion in which a species tree that minimizes the number of gene duplications induced among a set of gene trees is selected. The run time performance of previous implementations has limited its use on large-scale data sets. We used new software that incorporates recent algorithmic advances to examine the performance of GTP on a plant data set consisting of 18,896 gene trees containing 510,922 protein sequences from 136 plant taxa (giving a combined alignment length of >2.9 million characters). The relationships inferred from the GTP analysis were largely consistent with previous large-scale studies of backbone plant phylogeny and resolved some controversial nodes. The placement of taxa that were present in few gene trees generally varied the most among GTP bootstrap replicates. Excluding these taxa either before or after the GTP analysis revealed high levels of phylogenetic support across plants. The analyses supported magnoliids sister to a eudicot + monocot clade and did not support the eurosid I and II clades. This study presents a nuclear genomic perspective on the broad-scale phylogenetic relationships among plants, and it demonstrates that nuclear genes with a history of duplication and loss can be phylogenetically informative for resolving the plant tree of life.
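
The GTP criterion can be made concrete with a small sketch of how duplications are counted for one candidate species tree. This is not the software used in the study. It applies the standard least-common-ancestor reconciliation rule (a gene-tree node is a duplication when it maps to the same species-tree node as one of its children) and assumes rooted binary trees written as nested tuples, with each gene-tree leaf labelled by the species it was sampled from. All function names are illustrative.

```python
def clades_of(tree):
    """Return (leaf set of `tree`, leaf sets of every node in `tree`)."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves, all_clades = frozenset(), []
    for child in tree:
        child_leaves, child_clades = clades_of(child)
        leaves |= child_leaves
        all_clades.extend(child_clades)
    all_clades.append(leaves)
    return leaves, all_clades


def lca(species, clades):
    """Smallest species-tree clade containing every member of `species`."""
    # Raises ValueError if a species is absent from the species tree.
    return min((c for c in clades if species <= c), key=len)


def count_duplications(gene_tree, clades):
    """Return (species-tree clade the root maps to, inferred duplications)."""
    if isinstance(gene_tree, str):
        return lca(frozenset([gene_tree]), clades), 0
    child_maps, dups = [], 0
    for child in gene_tree:
        mapped, child_dups = count_duplications(child, clades)
        child_maps.append(mapped)
        dups += child_dups
    here = lca(frozenset().union(*child_maps), clades)
    if here in child_maps:  # parent and child map to the same species node
        dups += 1
    return here, dups


def duplication_cost(species_tree, gene_trees):
    """Total duplications implied by `species_tree` across all gene trees."""
    _, clades = clades_of(species_tree)
    return sum(count_duplications(g, clades)[1] for g in gene_trees)
```

Under GTP, the candidate species tree with the lowest `duplication_cost` is preferred. For example, given two gene trees `(("A", "C"), "B")` and one `(("A", "B"), "C")`, the species tree `(("A", "C"), "B")` implies one duplication whereas `(("A", "B"), "C")` implies two. A full analysis also needs a search over species-tree topologies, which this sketch omits.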


Subjects
Classification/methods , Phylogeny , Plants/classification , Plants/genetics , Algorithms , Expressed Sequence Tags , Genomics
18.
Syst Biol ; 59(4): 369-83, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20547776

ABSTRACT

The rich knowledge of morphological variation among organisms reported in the systematic literature has remained in free-text format, impractical for use in large-scale synthetic phylogenetic work. This noncomputable format has also precluded linkage to the large knowledgebase of genomic, genetic, developmental, and phenotype data in model organism databases. We have undertaken an effort to prototype a curated, ontology-based evolutionary morphology database that maps to these genetic databases (http://kb.phenoscape.org) to facilitate investigation into the mechanistic basis and evolution of phenotypic diversity. Among the first requirements in establishing this database was the development of a multispecies anatomy ontology with the goal of capturing anatomical data in a systematic and computable manner. An ontology is a formal representation of a set of concepts with defined relationships between those concepts. Multispecies anatomy ontologies in particular are an efficient way to represent the diversity of morphological structures in a clade of organisms, but they present challenges in their development relative to single-species anatomy ontologies. Here, we describe the Teleost Anatomy Ontology (TAO), a multispecies anatomy ontology for teleost fishes derived from the Zebrafish Anatomical Ontology (ZFA) for the purpose of annotating varying morphological features across species. To facilitate interoperability with other anatomy ontologies, TAO uses the Common Anatomy Reference Ontology as a template for its upper level nodes, and TAO and ZFA are synchronized, with zebrafish terms specified as subtypes of teleost terms. We found that the details of ontology architecture have ramifications for querying, and we present general challenges in developing a multispecies anatomy ontology, including refinement of definitions, taxon-specific relationships among terms, and representation of taxonomically variable developmental pathways.


Subjects
Biological Evolution; Fishes/anatomy & histology; Fishes/genetics; Animals; Classification; Computational Biology; Databases, Factual; Genomics
19.
Proc Natl Acad Sci U S A ; 107(26): 11889-94, 2010 Jun 29.
Article in English | MEDLINE | ID: mdl-20547848

ABSTRACT

The mushroom Coprinopsis cinerea is a classic experimental model for multicellular development in fungi because it grows on defined media, completes its life cycle in 2 weeks, produces some 10^8 synchronized meiocytes, and can be manipulated at all stages in development by mutation and transformation. The 37-megabase genome of C. cinerea was sequenced and assembled into 13 chromosomes. Meiotic recombination rates vary greatly along the chromosomes, and retrotransposons are absent in large regions of the genome with low levels of meiotic recombination. Single-copy genes with identifiable orthologs in other basidiomycetes are predominant in low-recombination regions of the chromosome. In contrast, paralogous multicopy genes are found in the highly recombining regions, including a large family of protein kinases (FunK1) unique to multicellular fungi. Analyses of P450 and hydrophobin gene families confirmed that local gene duplications drive the expansions of paralogous copies and the expansions occur in independent lineages of Agaricomycotina fungi. Gene-expression patterns from microarrays were used to dissect the transcriptional program of dikaryon formation (mating). Several members of the FunK1 kinase family are differentially regulated during sexual morphogenesis, and coordinate regulation of adjacent duplications is rare. The genomes of C. cinerea and Laccaria bicolor, a symbiotic basidiomycete, share extensive regions of synteny. The largest syntenic blocks occur in regions with low meiotic recombination rates, no transposable elements, and tight gene spacing, where orthologous single-copy genes are overrepresented. The chromosome assembly of C. cinerea is an essential resource in understanding the evolution of multicellularity in the fungi.


Subjects
Chromosomes, Fungal/genetics; Coprinus/genetics; Evolution, Molecular; Base Sequence; Chromosome Mapping; Coprinus/cytology; Coprinus/growth & development; Cytochrome P-450 Enzyme System/genetics; DNA Primers/genetics; Fungal Proteins/genetics; Gene Duplication; Genome, Fungal; Meiosis/genetics; Molecular Sequence Data; Multigene Family; Phylogeny; Protein Kinases/genetics; RNA, Fungal/genetics; Recombination, Genetic; Retroelements/genetics
20.
PLoS One ; 5(5): e10708, 2010 May 20.
Article in English | MEDLINE | ID: mdl-20505755

ABSTRACT

BACKGROUND: The wealth of phenotypic descriptions documented in the published articles, monographs, and dissertations of phylogenetic systematics is traditionally reported in a free-text format, and it is therefore largely inaccessible for linkage to biological databases for genetics, development, and phenotypes, and difficult to manage for large-scale integrative work. The Phenoscape project aims to represent these complex and detailed descriptions with rich and formal semantics that are amenable to computation and integration with phenotype data from other fields of biology. This entails reconceptualizing the traditional free-text characters into the computable Entity-Quality (EQ) formalism using ontologies. METHODOLOGY/PRINCIPAL FINDINGS: We used ontologies and the EQ formalism to curate a collection of 47 phylogenetic studies on ostariophysan fishes (including catfishes, characins, minnows, knifefishes) and their relatives with the goal of integrating these complex phenotype descriptions with information from an existing model organism database (zebrafish, http://zfin.org). We developed a curation workflow for the collection of character, taxonomic and specimen data from these publications. A total of 4,617 phenotypic characters (10,512 states) for 3,449 taxa, primarily species, were curated into EQ formalism (for a total of 12,861 EQ statements) using anatomical and taxonomic terms from teleost-specific ontologies (Teleost Anatomy Ontology and Teleost Taxonomy Ontology) in combination with terms from a quality ontology (Phenotype and Trait Ontology). Standards and guidelines for consistently and accurately representing phenotypes were developed in response to the challenges that were evident from two annotation experiments and from feedback from curators. 
CONCLUSIONS/SIGNIFICANCE: The challenges we encountered and many of the curation standards and methods for improving consistency that we developed are generally applicable to any effort to represent phenotypes using ontologies. This is because an ontological representation of the detailed variations in phenotype, whether between mutant or wildtype, among individual humans, or across the diversity of species, requires a process by which a precise combination of terms from domain ontologies are selected and organized according to logical relations. The efficiencies that we have developed in this process will be useful for any attempt to annotate complex phenotypic descriptions using ontologies. We also discuss some ramifications of EQ representation for the domain of systematics.


Subjects
Biological Evolution; Computational Biology/methods; Databases, Genetic; Publications; Systems Biology; Animals; Fishes/growth & development; Phenotype