Results 1 - 14 of 14
1.
Nat Ecol Evol ; 8(7): 1224-1232, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38789640

ABSTRACT

Genetic and genomic data are collected for a vast array of scientific and applied purposes. Despite mandates for public archiving, data are typically used only by the generating authors. The reuse of genetic and genomic datasets remains uncommon because it is difficult, if not impossible, due to non-standard archiving practices and lack of contextual metadata. But as the new field of macrogenetics is demonstrating, if genetic data and their metadata were more accessible and FAIR (findable, accessible, interoperable and reusable) compliant, they could be reused for many additional purposes. We discuss the main challenges with existing genetic and genomic data archives, and suggest best practices for archiving genetic and genomic data. Recognizing that this is a longstanding issue due to little formal data management training within the fields of ecology and evolution, we highlight steps that research institutions and publishers could take to improve data archiving.


Subject(s)
Genomics , Databases, Genetic , Data Management , Metadata
2.
Conserv Biol ; 37(4): e14061, 2023 Aug.
Article in English | MEDLINE | ID: mdl-36704891

ABSTRACT

Genetic diversity within species represents a fundamental yet underappreciated level of biodiversity. Because genetic diversity can indicate species resilience to changing climate, its measurement is relevant to many national and global conservation policy targets. Many studies produce large amounts of genome-scale genetic diversity data for wild populations, but most (87%) do not include the associated spatial and temporal metadata necessary for them to be reused in monitoring programs or for acknowledging the sovereignty of nations or Indigenous peoples. We undertook a distributed datathon to quantify the availability of these missing metadata and to test the hypothesis that their availability decays with time. We also worked to remediate missing metadata by extracting them from associated published papers, online repositories, and direct communication with authors. Starting with 848 candidate genomic data sets (reduced representation and whole genome) from the International Nucleotide Sequence Database Collaboration, we determined that 561 contained mostly samples from wild populations. We successfully restored spatiotemporal metadata for 78% of these 561 data sets (n = 440 data sets with data on 45,105 individuals from 762 species in 17 phyla). Examining papers and online repositories was much more fruitful than contacting 351 authors, who replied to our email requests 45% of the time. Overall, 23% of our email queries to authors unearthed useful metadata. The probability of retrieving spatiotemporal metadata declined significantly as age of the data set increased. There was a 13.5% yearly decrease in metadata associated with published papers or online repositories and up to a 22% yearly decrease in metadata that were only available from authors. 
This rapid decay in metadata availability, mirrored in studies of other types of biological data, should motivate swift updates to data-sharing policies and researcher practices to ensure that the valuable context provided by metadata is not lost to conservation science forever.
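
The reported yearly decreases can be turned into a rough retention curve. The following is a minimal sketch that assumes the loss compounds multiplicatively from year to year; the abstract does not state the form of the fitted model, so both the compounding assumption and the function name `fraction_remaining` are illustrative only.

```python
def fraction_remaining(years, yearly_decrease=0.135):
    """Fraction of metadata still retrievable after `years`.

    Assumes (hypothetically) that the yearly decrease compounds,
    i.e. each year retains (1 - yearly_decrease) of the previous year.
    """
    return (1.0 - yearly_decrease) ** years


# 13.5% per year: metadata in papers or online repositories.
# 22% per year: metadata available only from authors (upper figure).
for age in (1, 5, 10):
    papers = fraction_remaining(age, 0.135)
    authors = fraction_remaining(age, 0.22)
    print(f"{age:>2} yr: papers/repositories {papers:.2f}, authors only {authors:.2f}")
```

Under this assumption, author-held metadata fall away markedly faster than metadata recorded in papers or repositories, which is the contrast the abstract draws.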


Spanish title (translated): Importance of timely metadata curation for the global surveillance of genetic diversity


Subject(s)
Conservation of Natural Resources , Metadata , Humans , Biodiversity , Probability , Genetic Variation
3.
PeerJ ; 9: e12063, 2021.
Article in English | MEDLINE | ID: mdl-34540369

ABSTRACT

BACKGROUND: Understanding region-wide patterns of larval connectivity and gene flow is crucial for managing and conserving marine biodiversity. Dongsha Atoll National Park (DANP), located in the northern South China Sea (SCS), was established in 2007 to study and conserve this diverse and remote coral atoll. However, the role of Dongsha Atoll in connectivity throughout the SCS is seldom studied. In this study, we aim to evaluate the role of DANP in conserving regional marine biodiversity. METHODS: In total, 206 samples across nine marine species were collected and sequenced from Dongsha Atoll, and these data were combined with available sequence data from each of these nine species archived in the Genomic Observatories Metadatabase (GEOME). Together, these data provide the most extensive population genetic analysis of a single marine protected area. We evaluate metapopulation structure for each species by using a coalescent sampler, selecting among panmixia, stepping-stone, and island models of connectivity in a likelihood-based framework. We then completed a heuristic graph theoretical analysis based on maximum dispersal distance to get a sense of Dongsha's centrality within the SCS. RESULTS: Our dataset yielded 111 unique haplotypes across all taxa at DANP, 58% of which were not sampled elsewhere. Analysis of metapopulation structure showed that five out of nine species have strong regional connectivity across the SCS such that their gene pools are effectively panmictic (mean pelagic larval duration (PLD) = 78 days, sd = 60 days); while four species have stepping-stone metapopulation structure, indicating that larvae are exchanged primarily between nearby populations (mean PLD = 37 days, sd = 15 days). For all but one species, Dongsha was ranked within the top 15 out of 115 large reefs in the South China Sea for betweenness centrality. Thus, for most species, Dongsha Atoll provides an essential link for maintaining stepping-stone gene flow across the SCS. 
CONCLUSIONS: This multispecies study provides the most comprehensive examination of the role of Dongsha Atoll in marine connectivity in the South China Sea to date. Combining new and existing population genetic data for nine coral reef species in the region with a graph theoretical analysis, this study provides evidence that Dongsha Atoll is an important hub for sustaining connectivity for the majority of coral-reef species in the region.
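
The graph-theoretical step described in the methods can be illustrated with a small sketch. It assumes that two sites are linked whenever they lie within a maximum dispersal distance of each other, and it scores each site by unnormalized betweenness centrality using Brandes' algorithm on an unweighted graph. The site names, coordinates, distance threshold, and function names are hypothetical and are not taken from the study.

```python
from collections import deque
from math import hypot


def build_graph(coords, max_dist):
    """Link every pair of sites no farther apart than max_dist."""
    names = list(coords)
    graph = {name: [] for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (xa, ya), (xb, yb) = coords[a], coords[b]
            if hypot(xa - xb, ya - yb) <= max_dist:
                graph[a].append(b)
                graph[b].append(a)
    return graph


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                         # nodes in order of discovery
        preds = {v: [] for v in graph}     # predecessors on shortest paths
        paths = {v: 0 for v in graph}      # number of shortest paths from s
        dist = {v: -1 for v in graph}
        paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                       # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):          # accumulate dependencies
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Undirected graph: every pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical coordinates in arbitrary units, not real reef positions.
sites = {"west": (0, 0), "hub": (1, 0), "east": (2, 0), "far_east": (3, 0)}
ranks = betweenness(build_graph(sites, max_dist=1.0))
print(sorted(ranks.items(), key=lambda kv: -kv[1]))
```

Sites with high scores sit on many shortest paths between other sites, which is the sense in which a reef can act as a stepping-stone hub.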

4.
Proc Natl Acad Sci U S A ; 118(34), 2021 Aug 24.
Article in English | MEDLINE | ID: mdl-34404731

ABSTRACT

Genomic data are being produced and archived at a prodigious rate, and current studies could become historical baselines for future global genetic diversity analyses and monitoring programs. However, when we evaluated the potential utility of genomic data from wild and domesticated eukaryote species in the world's largest genomic data repository, we found that most archived genomic datasets (86%) lacked the spatiotemporal metadata necessary for genetic biodiversity surveillance. Labor-intensive scouring of a subset of published papers yielded geospatial coordinates and collection years for only 33% (39% if place names were considered) of these genomic datasets. Streamlined data input processes, updated metadata deposition policies, and enhanced scientific community awareness are urgently needed to preserve these irreplaceable records of today's genetic biodiversity and to plug the growing metadata gap.


Subject(s)
Biodiversity , Data Accuracy , Eukaryota/genetics , Genetic Variation , Genome , Genomics/methods , Population Dynamics
5.
Mol Ecol Resour ; 20(6): 1458-1469, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33031625

ABSTRACT

Genetic data represent a relatively new frontier for our understanding of global biodiversity. Ideally, such data should include both organismal DNA-based genotypes and the ecological context where the organisms were sampled. Yet most tools and standards for data deposition focus exclusively either on genetic or ecological attributes. The Genomic Observatories Metadatabase (GEOME: geome-db.org) provides an intuitive solution for maintaining links between genetic data sets stored by the International Nucleotide Sequence Database Collaboration (INSDC) and their associated ecological metadata. GEOME facilitates the deposition of raw genetic data to INSDC's Sequence Read Archive (SRA) while maintaining persistent links to standards-compliant ecological metadata held in the GEOME database. This approach facilitates findable, accessible, interoperable and reusable data archival practices. Moreover, GEOME enables data management solutions for large collaborative groups and expedites batch retrieval of genetic data from the SRA. The article that follows describes how GEOME can enable genuinely open data workflows for researchers in the field of molecular ecology.


Subject(s)
Biodiversity , Databases, Nucleic Acid , Genomics , Metadata , Research , Ecology , Information Storage and Retrieval , Workflow
6.
Evol Appl ; 12(2): 255-265, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30697337

ABSTRACT

Assessing the geographic structure of populations has relied heavily on Sewall Wright's F-statistics and their numerous analogues for many decades. However, it is well appreciated that, due to their nonlinear relationship with gene flow, F-statistics frequently fail to reject the null model of panmixia in species with relatively high levels of gene flow and large population sizes. Coalescent genealogy samplers instead allow a model-selection approach to the characterization of population structure, thereby providing the opportunity for stronger inference. Here, we validate the use of coalescent samplers in a high gene flow context using simulations of a stepping-stone model. In an example case study, we then re-analyze genetic datasets from 41 marine species sampled from throughout the Hawaiian archipelago using coalescent model selection. Due to the archipelago's linear nature, it is expected that most species will conform to some sort of stepping-stone model (leading to an expected pattern of isolation by distance), but F-statistics have only supported this inference in ~10% of these datasets. Our simulation analysis shows that a coalescent sampler can make a correct inference of stepping-stone gene flow in nearly 100% of cases where gene flow is ≤100 migrants per generation (equivalent to F_ST = 0.002), while F-statistics had mixed results. Our re-analysis of empirical datasets found that nearly 70% of datasets with an unambiguous result fit a stepping-stone model with varying population sizes and rates of gene flow, although 37% of datasets yielded ambiguous results. Together, our results demonstrate that coalescent samplers hold great promise for detecting weak but meaningful population structure, and defining appropriate management units.
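
The nonlinear relationship between F-statistics and gene flow mentioned above can be illustrated with Wright's classic island-model approximation for diploid loci, F_ST ≈ 1 / (4·N_e·m + 1). This is a minimal sketch under that textbook model only; it is not the stepping-stone simulation used in the study, and the function name is hypothetical.

```python
def island_model_fst(migrants_per_generation):
    """Wright's island-model approximation: F_ST = 1 / (4 * N_e * m + 1).

    `migrants_per_generation` is the product N_e * m. This is the textbook
    infinite-island formula, used here purely for illustration.
    """
    return 1.0 / (4.0 * migrants_per_generation + 1.0)


for nm in (0.1, 1, 10, 100):
    print(f"N_e*m = {nm:>5}: F_ST ~ {island_model_fst(nm):.4f}")
```

Under this approximation, F_ST falls steeply as the first few migrants per generation are added and then flattens toward zero, so at around 100 migrants per generation it is of the same order as the value quoted in the abstract and hard to distinguish from panmixia.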

7.
PLoS Biol ; 15(8): e2002925, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28771471

ABSTRACT

The Genomic Observatories Metadatabase (GeOMe, http://www.geome-db.org/) is an open access repository for geographic and ecological metadata associated with biosamples and genetic data. Whereas public databases have served as vital repositories for nucleotide sequences, they do not accession all the metadata required for ecological or evolutionary analyses. GeOMe fills this need, providing a user-friendly, web-based interface for both data contributors and data recipients. The interface allows data contributors to create a customized yet standard-compliant spreadsheet that captures the temporal and geospatial context of each biosample. These metadata are then validated and permanently linked to archived genetic data stored in the National Center for Biotechnology Information's (NCBI's) Sequence Read Archive (SRA) via unique persistent identifiers. By linking ecologically and evolutionarily relevant metadata with publicly archived sequence data in a structured manner, GeOMe sets a gold standard for data management in biodiversity science.


Subject(s)
Biodiversity , Databases, Nucleic Acid , Metadata , Metagenomics
8.
Glob Chang Biol ; 22(2): 465-73, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26618788

ABSTRACT

Accelerated loss of sea ice in the Arctic is opening routes connecting the Atlantic and Pacific Oceans for longer periods each year. These changes may increase the ease and frequency with which marine birds and mammals move between the Pacific and Atlantic Ocean basins. Indeed, recent observations of birds and mammals suggest these movements have intensified in recent decades. Reconnection of the Pacific and Atlantic Ocean basins will present both challenges to marine ecosystem conservation and an unprecedented opportunity to examine the ecological and evolutionary consequences of interoceanic faunal exchange in real time. To understand these changes and implement effective conservation of marine ecosystems, we need to further develop modeling efforts to predict the rate of dispersal and consequences of faunal exchange. These predictions can be tested by closely monitoring wildlife dispersal through the Arctic Ocean and using modern methods to explore the ecological and evolutionary consequences of these movements.


Subject(s)
Animal Migration , Conservation of Natural Resources , Animals , Ecosystem , Oceans and Seas
9.
Curr Zool ; 62(6): 581-601, 2016 Dec.
Article in English | MEDLINE | ID: mdl-29491947

ABSTRACT

Population genomic approaches are making rapid inroads in the study of non-model organisms, including marine taxa. To date, these marine studies have predominantly focused on rudimentary metrics describing the spatial and environmental context of their study region (e.g., geographical distance, average sea surface temperature, average salinity). We contend that a more nuanced and considered approach to quantifying seascape dynamics and patterns can strengthen population genomic investigations and help identify spatial, temporal, and environmental factors associated with differing selective regimes or demographic histories. Nevertheless, approaches for quantifying marine landscapes are complicated. Characteristic features of the marine environment, including pelagic living in flowing water (experienced by most marine taxa at some point in their life cycle), require a well-designed spatial-temporal sampling strategy and analysis. Many genetic summary statistics used to describe populations may be inappropriate for marine species with large population sizes, large species ranges, stochastic recruitment, and asymmetrical gene flow. Finally, statistical approaches for testing associations between seascapes and population genomic patterns are still maturing with no single approach able to capture all relevant considerations. None of these issues are completely unique to marine systems and therefore similar issues and solutions will be shared for many organisms regardless of habitat. Here, we outline goals and spatial approaches for landscape genomics with an emphasis on marine systems and review the growing empirical literature on seascape genomics. We review established tools and approaches and highlight promising new strategies to overcome select issues including a strategy to spatially optimize sampling. 
Despite the many challenges, we argue that marine systems may be especially well suited for identifying candidate genomic regions under environmentally mediated selection and that seascape genomic approaches are especially useful for identifying robust locus-by-environment associations.

10.
PLoS One ; 10(7): e0131276, 2015.
Article in English | MEDLINE | ID: mdl-26200779

ABSTRACT

Understanding seasonal migration and localized persistence of populations is critical for effective species harvest and conservation management. Pacific salmon (genus Oncorhynchus) forecasting models predict stock composition, abundance, and distribution during annual assessments of proposed fisheries impacts. Most models, however, fail to account for the influence of biophysical factors on year-to-year fluctuations in migratory distributions and stock-specific survival. In this study, the ocean distribution and relative abundance of Chinook salmon (O. tshawytscha) stocks encountered in the California Current large marine ecosystem, U.S.A were inferred using catch-per-unit effort (CPUE) fisheries and genetic stock identification data. In contrast to stock distributions estimated through coded-wire-tag recoveries (typically limited to hatchery salmon), stock-specific CPUE provides information for both wild and hatchery fish. Furthermore, in contrast to stock composition results, the stock-specific CPUE metric is independent of other stocks and is easily interpreted over multiple temporal or spatial scales. Tests for correlations between stock-specific CPUE and stock composition estimates revealed these measures diverged once proportional contributions of locally rare stocks were excluded from data sets. A novel aspect of this study was collection of data both in areas closed to commercial fisheries and during normal, open commercial fisheries. Because fishing fleet efficiency influences catch rates, we tested whether CPUE differed between closed area (non-retention) and open area (retention) data sets. A weak effect was indicated for some, but not all, analyzed cases. 
Novel visualizations produced from stock-specific CPUE-based ocean abundance facilitate consideration of how highly refined spatial and genetic information could be incorporated into ocean fisheries management systems and used in investigations of biogeographic factors that influence migratory distributions of fish.


Subject(s)
Animal Migration/physiology , Salmon/physiology , Seasons , Animals , Fisheries , Pacific Ocean , United States
11.
Mol Ecol ; 21(22): 5579-98, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23050562

ABSTRACT

Marine species in the Indo-Pacific have ranges that can span thousands of kilometres, yet studies increasingly suggest that mean larval dispersal distances are less than historically assumed. Gene flow across these ranges must therefore rely to some extent on larval dispersal among intermediate 'stepping-stone' populations in combination with long-distance dispersal far beyond the mean of the dispersal kernel. We evaluate the strength of stepping-stone dynamics by employing a spatially explicit biophysical model of larval dispersal in the tropical Pacific to construct hypotheses for dispersal pathways. We evaluate these hypotheses with coalescent models of gene flow among high-island archipelagos in four neritid gastropod species. Two of the species live in the marine intertidal, while the other two are amphidromous, living in fresh water but retaining pelagic dispersal. Dispersal pathways predicted by the biophysical model were strongly favoured in 16 of 18 tests against alternate hypotheses. In regions where connectivity among high-island archipelagos was predicted as direct, there was no difference in gene flow between marine and amphidromous species. In regions where connectivity was predicted through stepping-stone atolls only accessible to marine species, gene flow estimates between high-island archipelagos were significantly higher in marine species. Moreover, one of the marine species showed a significant pattern of isolation by distance consistent with stepping-stone dynamics. While our results support stepping-stone dynamics in Indo-Pacific species, we also see evidence for nonequilibrium processes such as range expansions or rare long-distance dispersal events. This study couples population genetic and biophysical models to help to shed light on larval dispersal pathways.


Subject(s)
Animal Distribution , Gene Flow , Models, Genetic , Snails/genetics , Animals , Bayes Theorem , Ecosystem , Genetic Variation , Genetics, Population , Geography , Larva/genetics , Pacific Islands
12.
Mol Biol Evol ; 29(2): 707-19, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21926069

ABSTRACT

The rate of change in DNA is an important parameter for understanding molecular evolution and hence for inferences drawn from studies of phylogeography and phylogenetics. Most rate calibrations for mitochondrial coding regions in marine species have been made from divergence dating for fossils and vicariant events older than 1-2 My and are typically 0.5-2% per lineage per million years. Recently, calibrations made with ancient DNA (aDNA) from younger dates have yielded faster rates, suggesting that estimates of the molecular rate of change depend on the time of calibration, decaying from the instantaneous mutation rate to the phylogenetic substitution rate. aDNA methods for recent calibrations are not available for most marine taxa so instead we use radiometric dates for sea-level rise onto the Sunda Shelf following the Last Glacial Maximum (starting ∼18,000 years ago), which led to massive population expansions for marine species. Instead of divergence dating, we use a two-epoch coalescent model of logistic population growth preceded by a constant population size to infer a time in mutational units for the beginning of these expansion events. This model compares favorably to simpler coalescent models of constant population size, and exponential or logistic growth, and is far more precise than estimates from the mismatch distribution. Mean rates estimated with this method for mitochondrial coding genes in three invertebrate species are elevated in comparison to older calibration points (2.3-6.6% per lineage per million years), lending additional support to the hypothesis of calibration time dependency for molecular rates.


Subject(s)
Aquatic Organisms/genetics , Biological Evolution , Evolution, Molecular , Phylogeography/methods , Animals , Arthropods/genetics , Bivalvia/genetics , DNA/genetics , Echinodermata/genetics , Genes, Mitochondrial , Genetic Variation , Ice Cover , Mitochondria/genetics , Models, Genetic , Mutation Rate , Phylogeny , Time Factors
13.
Mol Ecol ; 17(24): 5276-90, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19067797

ABSTRACT

Repeated exposure and flooding of the Sunda and Sahul shelves during Pleistocene sea-level fluctuations is thought to have contributed to the isolation and diversification of sea-basin populations within the Coral Triangle. This hypothesis has been tested in numerous phylogeographical studies, recovering an assortment of genetic patterns that the authors have generally attributed to differences in larval dispersal capability or adult habitat specificity. This study compares phylogeographical patterns from mitochondrial COI sequences among two co-distributed seastars that differ in their adult habitat and dispersal ability, and two seastar ectosymbionts that differ in their degree of host specificity. Of these, only the seastar Linckia laevigata displayed a classical pattern of Indian-Pacific divergence, but with only moderate genetic structure (PhiCT = 0.067). In contrast, the seastar Protoreaster nodosus exhibited strong structure (PhiCT = 0.23) between Teluk Cenderawasih and the remainder of Indonesia, a pattern of regional structure that was echoed in L. laevigata (PhiCT = 0.03) as well as its obligate gastropod parasite Thyca crystallina (PhiCT = 0.04). The generalist commensal shrimp, Periclimenes soror, showed little genetic structuring across the Coral Triangle. Despite species-specific phylogeographical patterns, all four species showed departures from neutrality that are consistent with massive range expansions onto the continental shelves as the sea levels rose, and that date within the Pleistocene epoch. Our results suggest that habitat differences may affect the manner in which species responded to Pleistocene sea-level fluctuations, shaping contemporary patterns of genetic structure and diversity.


Subject(s)
Anthozoa/genetics , Genetics, Population , Phylogeny , Symbiosis , Animals , DNA, Mitochondrial/genetics , Decapoda/genetics , Ecosystem , Evolution, Molecular , Gastropoda/genetics , Geography , Haplotypes , Indian Ocean , Indonesia , Pacific Ocean , Sequence Alignment , Sequence Analysis, DNA , Species Specificity
14.
Mol Ecol ; 17(2): 611-26, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18179436

ABSTRACT

Marine species with ranges that span the Indo-Australian Archipelago (IAA) exhibit a range of phylogeographical patterns, most of which are interpreted in the context of vicariance between Indian and Pacific Ocean populations during Pliocene and Pleistocene low sea-level stands. However, patterns often vary among ecologically similar taxa, sometimes even within genera. This study compares phylogeographical patterns in two species of highly dispersive neritid gastropod, Nerita albicilla and Nerita plicata, with nearly sympatric ranges that span the Indo-Pacific. Mitochondrial COI sequences from >1000 individuals from 97 sites reveal similar phylogenies in both species (two divergent clades differing by 3.2% and 2.3%, for N. albicilla and N. plicata, respectively). However, despite ecological similarity and congeneric status, the two species exhibit phylogeographical discordance. N. albicilla has maintained reciprocal monophyly of Indian and Pacific Ocean populations, while N. plicata is panmictic between oceans, but displays a genetic cline in the Central Pacific. Although this difference might be explained by qualitatively different demographic histories, parameter estimates from three coalescent models indicate that both species have high levels of gene flow between demes (2N_e m > 75), and share a common history of population expansion that is likely associated with cyclical flooding of continental shelves and island lagoons following low sea-level stands. Results indicate that ecologically similar, codistributed species may respond very differently to shared environmental processes, suggesting that relatively minor differences in traits such as pelagic larval duration or microhabitat association may profoundly impact phylogeographical structure.


Subject(s)
Gastropoda/genetics , Phylogeny , Animals , Australia , DNA, Mitochondrial/chemistry , DNA, Mitochondrial/genetics , Gastropoda/classification , Genetic Variation , Genetics, Population , Geography , Molecular Sequence Data , Pacific Ocean , Sequence Analysis, DNA