Results 1 - 20 of 25
1.
Philos Trans R Soc Lond B Biol Sci ; 379(1904): 20230104, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38705176

ABSTRACT

Technological advancements in biological monitoring have facilitated the study of insect communities at unprecedented spatial scales. The progress allows more comprehensive coverage of the diversity within a given area while minimizing disturbance and reducing the need for extensive human labour. Compared with traditional methods, these novel technologies offer the opportunity to examine biological patterns that were previously beyond our reach. However, to address the pressing scientific inquiries of the future, data must be easily accessible, interoperable and reusable for the global research community. Biodiversity information standards and platforms provide the necessary infrastructure to standardize and share biodiversity data. This paper explores the possibilities and prerequisites of publishing insect data obtained through novel monitoring methods through GBIF, the most comprehensive global biodiversity data infrastructure. We describe the essential components of metadata standards and existing data standards for occurrence data on insects, including data extensions. By addressing the current opportunities, limitations, and future development of GBIF's publishing framework, we hope to encourage researchers to both share data and contribute to the further development of biodiversity data standards and publishing models. Wider commitments to open data initiatives will promote data interoperability and support cross-disciplinary scientific research and key policy indicators. This article is part of the theme issue 'Towards a toolkit for global insect biodiversity monitoring'.


Subject(s)
Biodiversity , Information Dissemination , Insecta , Animals , Entomology/methods , Entomology/standards , Information Dissemination/methods , Metadata
2.
GigaByte ; 2024: gigabyte117, 2024.
Article in English | MEDLINE | ID: mdl-38646088

ABSTRACT

There is an increased awareness of the importance of data publication, data sharing, and open science to support research, monitoring and control of vector-borne disease (VBD). Here we describe the efforts of the Global Biodiversity Information Facility (GBIF) as well as the World Health Organization Special Programme on Research and Training in Diseases of Poverty (TDR) to promote publication of data related to vectors of diseases. In 2020, a GBIF task group of experts was formed to provide advice and support efforts aimed at enhancing the coverage and accessibility of data on vectors of human diseases within GBIF. Various strategies, such as organizing training courses and publishing data papers, were used to increase this content. This editorial introduces the outcome of a second call for data papers partnered by the TDR, GBIF and GigaScience Press in the journal GigaByte. Biodiversity and infectious diseases are linked in complex ways. These links can involve changes from the microorganism level to that of the habitat, and there are many ways in which these factors interact to affect human health. One way to tackle disease control, and possibly elimination, is to provide stakeholders with access to a wide range of data shared under the FAIR principles, so it is possible to support early detection, analyses and evaluation, and to promote policy improvements and/or development.

3.
Nucleic Acids Res ; 52(D1): D791-D797, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37953409

ABSTRACT

UNITE (https://unite.ut.ee) is a web-based database and sequence management environment for molecular identification of eukaryotes. It targets the nuclear ribosomal internal transcribed spacer (ITS) region and offers nearly 10 million such sequences for reference. These are clustered into ∼2.4M species hypotheses (SHs), each assigned a unique digital object identifier (DOI) to promote unambiguous referencing across studies. UNITE users have contributed over 600 000 third-party sequence annotations, which are shared with a range of databases and other community resources. Recent improvements facilitate the detection of cross-kingdom biological associations and the integration of undescribed groups of organisms into everyday biological pursuits. Serving as a digital twin for eukaryotic biodiversity and communities worldwide, the latest release of UNITE offers improved avenues for biodiversity discovery, precise taxonomic communication and integration of biological knowledge across platforms.


Subject(s)
Databases, Nucleic Acid , Fungi , DNA, Ribosomal Spacer , Fungi/genetics , Biodiversity , DNA, Fungal , Phylogeny
4.
Trends Ecol Evol ; 38(10): 916-926, 2023 10.
Article in English | MEDLINE | ID: mdl-37208222

ABSTRACT

Digital twins (DTs) are an emerging phenomenon in the public and private sectors as a new tool to monitor and understand systems and processes. DTs have the potential to change the status quo in ecology as part of its digital transformation. However, it is important to avoid misguided developments by managing expectations about DTs. We stress that DTs are not just big models of everything, containing big data and machine learning. Rather, the strength of DTs is in combining data, models, and domain knowledge, and their continuous alignment with the real world. We suggest that researchers and stakeholders exercise caution in DT development, keeping in mind that many of the strengths and challenges of computational modelling in ecology also apply to DTs.


Subject(s)
Computer Simulation , Ecology , Big Data , Machine Learning
5.
MycoKeys ; 96: 77-95, 2023.
Article in English | MEDLINE | ID: mdl-37214177

ABSTRACT

The MycoPins method described here is a rapid and affordable protocol to monitor early colonization events in communities of wood-inhabiting fungi in fine woody debris. It includes easy to implement field sampling techniques and sample processing, followed by data processing, and analysis of the development of early dead wood fungal communities. The method is based on fieldwork from a time series experiment on standard sterilized colonization targets followed by the metabarcoding analysis and automated molecular identification of species. This new monitoring method through its simplicity, moderate costs, and scalability paves a way for a broader and scalable project pipeline. MycoPins establishes a standard routine for research stations or regularly visited field sites for monitoring of fungal colonization of woody substrates. The routine uses widely available consumables and therefore presents a unifying method for monitoring of fungi of this type.

6.
One Health ; 16: 100484, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36714536

ABSTRACT

The unprecedented generation of large volumes of biodiversity data is consistently contributing to a wide range of disciplines, including disease ecology. Emerging infectious diseases are usually zoonoses caused by multi-host pathogens. Therefore, their understanding may require the access to biodiversity data related to the ecology and the occurrence of the species involved. Nevertheless, despite several data-mobilization initiatives, the usage of biodiversity data for research into disease dynamics has not yet been fully leveraged. To explore current contribution, trends, and to identify limitations, we characterized biodiversity data usage in scientific publications related to human health, contrasting patterns of studies citing the Global Biodiversity Information Facility (GBIF) with those obtaining data from other sources. We found that the studies mainly obtained data from scientific literature and other not aggregated or standardized sources. Most of the studies explored pathogen species and, particularly those with GBIF-mediated data, tended to explore and reuse data of multiple species (>2). Data sources varied according to the taxa and epidemiological roles of the species involved. Biodiversity data repositories were mainly used for species related to hosts, reservoirs, and vectors, and barely used as a source of pathogens data, which was usually obtained from human and animal-health related institutions. While both GBIF- and not GBIF-mediated data studies explored similar diseases and topics, they presented discipline biases and different analytical approaches. Research on emerging infectious diseases may require the access to geographical and ecological data of multiple species. The One Health challenge requires interdisciplinary collaboration and data sharing, which is facilitated by aggregated repositories and platforms. The contribution of biodiversity data to understand infectious disease dynamics should be acknowledged, strengthened, and promoted.

7.
Gigascience ; 11, 2022 11 03.
Article in English | MEDLINE | ID: mdl-36329618

ABSTRACT

Vector-borne diseases are responsible for more than 17% of human cases of infectious diseases. In most situations, effective control of debilitating and deadly vector-borne diseases (VBDs), such as malaria, dengue, chikungunya, yellow fever, Zika and Chagas, requires up-to-date, robust and comprehensive information on the presence, diversity, ecology, bionomics and geographic spread of the organisms that carry and transmit the infectious agents. Huge gaps exist in the information related to these vectors, creating an essential need for campaigns to mobilise and share data. The publication of data papers is an effective tool for overcoming this challenge. These peer-reviewed articles provide scholarly credit for researchers whose vital work of assembling and publishing well-described, properly-formatted datasets often fails to receive appropriate recognition. To address this, GigaScience's sister journal GigaByte partnered with the Global Biodiversity Information Facility (GBIF) to publish a series of data papers, with support from the Special Programme for Research and Training in Tropical Diseases (TDR), hosted by the World Health Organisation (WHO). Here we outline the initial results of this targeted approach to sharing data and describe its importance for controlling VBDs and improving public health.


Subject(s)
Communicable Diseases , Zika Virus Infection , Zika Virus , Animals , Humans , Disease Vectors , Publishing
8.
Trends Ecol Evol ; 37(10): 872-885, 2022 10.
Article in English | MEDLINE | ID: mdl-35811172

ABSTRACT

Insects are the most diverse group of animals on Earth, but their small size and high diversity have always made them challenging to study. Recent technological advances have the potential to revolutionise insect ecology and monitoring. We describe the state of the art of four technologies (computer vision, acoustic monitoring, radar, and molecular methods), and assess their advantages, current limitations, and future potential. We discuss how these technologies can adhere to modern standards of data curation and transparency, their implications for citizen science, and their potential for integration among different monitoring programmes and technologies. We argue that they provide unprecedented possibilities for insect ecology and monitoring, but it will be important to foster international standards via collaboration.


Subject(s)
Ecology , Insecta , Animals , Ecology/methods
9.
Sci Data ; 9(1): 391, 2022 07 09.
Article in English | MEDLINE | ID: mdl-35810161

ABSTRACT

The Country Compendium of the Global Register of Introduced and Invasive Species (GRIIS) is a collation of data across 196 individual country checklists of alien species, along with a designation of those species with evidence of impact at a country level. The Compendium provides a baseline for monitoring the distribution and invasion status of all major taxonomic groups, and can be used for the purpose of global analyses of introduced (alien, non-native, exotic) and invasive species (invasive alien species), including regional, single and multi-species taxon assessments and comparisons. It enables exploration of gaps and inferred absences of species across countries, and also provides one means for updating individual GRIIS Checklists. The Country Compendium is, for example, instrumental, along with data on first records of introduction, for assessing and reporting on invasive alien species targets, including for the Convention on Biological Diversity and Sustainable Development Goals. The GRIIS Country Compendium provides a baseline and mechanism for tracking the spread of introduced and invasive alien species across countries globally.

Design Type(s): Data integration objective ● Observation design
Measurement Type(s): Alien species occurrence ● Evidence of impact invasive alien species assessment objective
Technology Type(s): Agent expert ● Data collation
Factor Type(s): Geographic location ● Origin / provenance ● Habitat
Sample Characteristics - Organism: Animalia ● Bacteria ● Chromista ● Fungi ● Plantae ● Protista (Protozoa) ● Viruses
Sample Characteristics - Location: Global countries


Subject(s)
Biodiversity , Introduced Species , Ecosystem , Eukaryota , Fungi , Plants
10.
Gigascience ; 12, 2022 12 28.
Article in English | MEDLINE | ID: mdl-37632753

ABSTRACT

Omic BON is a thematic Biodiversity Observation Network under the Group on Earth Observations Biodiversity Observation Network (GEO BON), focused on coordinating the observation of biomolecules in organisms and the environment. Our founding partners include representatives from national, regional, and global observing systems; standards organizations; and data and sample management infrastructures. By coordinating observing strategies, methods, and data flows, Omic BON will facilitate the co-creation of a global omics meta-observatory to generate actionable knowledge. Here, we present key elements of Omic BON's founding charter and first activities.


Subject(s)
Biodiversity , Knowledge
11.
Proc Natl Acad Sci U S A ; 118(6), 2021 02 09.
Article in English | MEDLINE | ID: mdl-33526679

ABSTRACT

The accessibility of global biodiversity information has surged in the past two decades, notably through widespread funding initiatives for museum specimen digitization and emergence of large-scale public participation in community science. Effective use of these data requires the integration of disconnected datasets, but the scientific impacts of consolidated biodiversity data networks have not yet been quantified. To determine whether data integration enables novel research, we carried out a quantitative text analysis and bibliographic synthesis of >4,000 studies published from 2003 to 2019 that use data mediated by the world's largest biodiversity data network, the Global Biodiversity Information Facility (GBIF). Data available through GBIF increased 12-fold since 2007, a trend matched by global data use with roughly two publications using GBIF-mediated data per day in 2019. Data-use patterns were diverse by authorship, geographic extent, taxonomic group, and dataset type. Despite facilitating global authorship, legacies of colonial science remain. Studies involving species distribution modeling were most prevalent (31% of literature surveyed) but recently shifted in focus from theory to application. Topic prevalence was stable across the 17-y period for some research areas (e.g., macroecology), yet other topics proportionately declined (e.g., taxonomy) or increased (e.g., species interactions, disease). Although centered on biological subfields, GBIF-enabled research extends surprisingly across all major scientific disciplines. Biodiversity data mobilization through global data aggregation has enabled basic and applied research use at temporal, spatial, and taxonomic scales otherwise not possible, launching biodiversity sciences into a new era.


Subject(s)
Biodiversity , Databases, Factual/standards , Animals , Classification , Humans , Museums
12.
Front Microbiol ; 11: 598321, 2020.
Article in English | MEDLINE | ID: mdl-33362746

ABSTRACT

Uzbekistan, located in Central Asia, harbors high diversity of woody plants. Diversity of wood-inhabiting fungi in the country, however, remained poorly known. This study summarizes the wood-inhabiting basidiomycete fungi (poroid and corticioid fungi plus similar taxa such as Merismodes, Phellodon, and Sarcodon) (Agaricomycetes, Basidiomycota) that have been found in Uzbekistan from 1950 to 2020. This work is based on 790 fungal occurrence records: 185 from recently collected specimens, 101 from herbarium specimens made by earlier collectors, and 504 from literature-based records. All data were deposited as a species occurrence record dataset in the Global Biodiversity Information Facility and also summarized in the form of an annotated checklist in this paper. All 286 available specimens were morphologically examined. For 138 specimens, the 114 ITS and 85 LSU nrDNA sequences were newly sequenced and used for phylogenetic analysis. In total, we confirm the presence of 153 species of wood-inhabiting poroid and corticioid fungi in Uzbekistan, of which 31 species are reported for the first time in Uzbekistan, including 19 that are also new to Central Asia. These 153 fungal species inhabit 100 host species from 42 genera of 23 families. Polyporales and Hymenochaetales are the most recorded fungal orders and are most widely distributed around the study area. This study provides the first comprehensively updated and annotated checklist of wood-inhabiting poroid and corticioid fungi in Uzbekistan. Such study should be expanded to other countries to further clarify species diversity of wood-inhabiting fungi around Central Asia.

13.
Microorganisms ; 8(12), 2020 Nov 30.
Article in English | MEDLINE | ID: mdl-33266327

ABSTRACT

Here, we describe the taxon hypothesis (TH) paradigm, which covers the construction, identification, and communication of taxa as datasets. Defining taxa as datasets of individuals and their traits will make taxon identification and most importantly communication of taxa precise and reproducible. This will allow datasets with standardized and atomized traits to be used digitally in identification pipelines and communicated through persistent identifiers. Such datasets are particularly useful in the context of formally undescribed or even physically undiscovered species if data such as sequences from samples of environmental DNA (eDNA) are available. Implementing the TH paradigm will to some extent remove the impediment to hastily discover and formally describe all extant species in that the TH paradigm allows discovery and communication of new species and other taxa also in the absence of formal descriptions. The TH datasets can be connected to a taxonomic backbone providing access to the vast information associated with the tree of life. In parallel to the description of the TH paradigm, we demonstrate how it is implemented in the UNITE digital taxon communication system. UNITE TH datasets include rich data on individuals and their rDNA ITS sequences. These datasets are equipped with digital object identifiers (DOI) that serve to fix their identity in our communication. All datasets are also connected to a GBIF taxonomic backbone. Researchers processing their eDNA samples using UNITE datasets will, thus, be able to publish their findings as taxon occurrences in the GBIF data portal. UNITE species hypothesis (species level THs) datasets are increasingly utilized in taxon identification pipelines and even formally undescribed species can be identified and communicated by using UNITE. The TH paradigm seeks to achieve unambiguous, unique, and traceable communication of taxa and their properties at any level of the tree of life. 
It offers a rapid way to discover and communicate undescribed species in identification pipelines and data portals before they are lost to the sixth mass extinction.

14.
Sci Data ; 6(1): 40, 2019 04 25.
Article in English | MEDLINE | ID: mdl-31024009

ABSTRACT

Arthropods play a dominant role in natural and human-modified terrestrial ecosystem dynamics. Spatially-explicit arthropod population time-series data are crucial for statistical or mathematical models of these dynamics and assessment of their veterinary, medical, agricultural, and ecological impacts. Such data have been collected world-wide for over a century, but remain scattered and largely inaccessible. In particular, with the ever-present and growing threat of arthropod pests and vectors of infectious diseases, there are numerous historical and ongoing surveillance efforts, but the data are not reported in consistent formats and typically lack sufficient metadata to make reuse and re-analysis possible. Here, we present the first-ever minimum information standard for arthropod abundance, Minimum Information for Reusable Arthropod Abundance Data (MIReAD). Developed with broad stakeholder collaboration, it balances sufficiency for reuse with the practicality of preparing the data for submission. It is designed to optimize data (re)usability from the "FAIR," (Findable, Accessible, Interoperable, and Reusable) principles of public data archiving (PDA). This standard will facilitate data unification across research initiatives and communities dedicated to surveillance for detection and control of vector-borne diseases and pests.


Subject(s)
Arthropods , Information Storage and Retrieval/standards , Animals , Arthropods/physiology , Biodiversity , Ecosystem , Information Dissemination , Population Dynamics
15.
Proc Natl Acad Sci U S A ; 116(19): 9658-9664, 2019 05 07.
Article in English | MEDLINE | ID: mdl-31004061

ABSTRACT

Biodiversity loss is a major challenge. Over the past century, the average rate of vertebrate extinction has been about 100-fold higher than the estimated background rate and population declines continue to increase globally. Birth and death rates determine the pace of population increase or decline, thus driving the expansion or extinction of a species. Design of species conservation policies hence depends on demographic data (e.g., for extinction risk assessments or estimation of harvesting quotas). However, an overview of the accessible data, even for better known taxa, is lacking. Here, we present the Demographic Species Knowledge Index, which classifies the available information for 32,144 (97%) of extant described mammals, birds, reptiles, and amphibians. We show that only 1.3% of the tetrapod species have comprehensive information on birth and death rates. We found no demographic measures, not even crude ones such as maximum life span or typical litter/clutch size, for 65% of threatened tetrapods. More field studies are needed; however, some progress can be made by digitalizing existing knowledge, by imputing data from related species with similar life histories, and by using information from captive populations. We show that data from zoos and aquariums in the Species360 network can significantly improve knowledge for an almost eightfold gain. Assessing the landscape of limited demographic knowledge is essential to prioritize ways to fill data gaps. Such information is urgently needed to implement management strategies to conserve at-risk taxa and to discover new unifying concepts and evolutionary relationships across thousands of tetrapod species.


Subject(s)
Biodiversity , Biological Evolution , Conservation of Natural Resources , Extinction, Biological , Vertebrates/physiology , Animals
16.
Biodivers Data J ; 7: e33679, 2019.
Article in English | MEDLINE | ID: mdl-30886531

ABSTRACT

There has been major progress over the last two decades in digitising historical knowledge of biodiversity and in making biodiversity data freely and openly accessible. Interlocking efforts bring together international partnerships and networks, national, regional and institutional projects and investments and countless individual contributors, spanning diverse biological and environmental research domains, government agencies and non-governmental organisations, citizen science and commercial enterprise. However, current efforts remain inefficient and inadequate to address the global need for accurate data on the world's species and on changing patterns and trends in biodiversity. Significant challenges include imbalances in regional engagement in biodiversity informatics activity, uneven progress in data mobilisation and sharing, the lack of stable persistent identifiers for data records, redundant and incompatible processes for cleaning and interpreting data and the absence of functional mechanisms for knowledgeable experts to curate and improve data. Recognising the need for greater alignment between efforts at all scales, the Global Biodiversity Information Facility (GBIF) convened the second Global Biodiversity Informatics Conference (GBIC2) in July 2018 to propose a coordination mechanism for developing shared roadmaps for biodiversity informatics. GBIC2 attendees reached consensus on the need for a global alliance for biodiversity knowledge, learning from examples such as the Global Alliance for Genomics and Health (GA4GH) and the open software communities under the Apache Software Foundation. These initiatives provide models for multiple stakeholders with decentralised funding and independent governance to combine resources and develop sustainable solutions that address common needs. This paper summarises the GBIC2 discussions and presents a set of 23 complementary ambitions to be addressed by the global community in the context of the proposed alliance. 
The authors call on all who are responsible for describing and monitoring natural systems, all who depend on biodiversity data for research, policy or sustainable environmental management and all who are involved in developing biodiversity informatics solutions to register interest at https://biodiversityinformatics.org/ and to participate in the next steps to establishing a collaborative alliance. The supplementary materials include brochures in a number of languages (English, Arabic, Spanish, Basque, French, Japanese, Dutch, Portuguese, Russian, Traditional Chinese and Simplified Chinese). These summarise the need for an alliance for biodiversity knowledge and call for collaboration in its establishment.

17.
Nucleic Acids Res ; 47(D1): D259-D264, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30371820

ABSTRACT

UNITE (https://unite.ut.ee/) is a web-based database and sequence management environment for the molecular identification of fungi. It targets the formal fungal barcode, the nuclear ribosomal internal transcribed spacer (ITS) region, and offers all ∼1 000 000 public fungal ITS sequences for reference. These are clustered into ∼459 000 species hypotheses and assigned digital object identifiers (DOIs) to promote unambiguous reference across studies. In-house and web-based third-party sequence curation and annotation have resulted in more than 275 000 improvements to the data over the past 15 years. UNITE serves as a data provider for a range of metabarcoding software pipelines and regularly exchanges data with all major fungal sequence databases and other community resources. Recent improvements include redesigned handling of unclassifiable species hypotheses, integration with the taxonomic backbone of the Global Biodiversity Information Facility, and support for an unlimited number of parallel taxonomic classification systems.


Subject(s)
Computational Biology/methods , DNA Barcoding, Taxonomic/methods , Databases, Nucleic Acid , Fungi/classification , Fungi/genetics , Genome, Fungal , Genomics , Genomics/methods , Software , Web Browser
18.
Nat Ecol Evol ; 2(10): 1531-1540, 2018 10.
Article in English | MEDLINE | ID: mdl-30224814

ABSTRACT

Essential Biodiversity Variables (EBVs) allow observation and reporting of global biodiversity change, but a detailed framework for the empirical derivation of specific EBVs has yet to be developed. Here, we re-examine and refine the previous candidate set of species traits EBVs and show how traits related to phenology, morphology, reproduction, physiology and movement can contribute to EBV operationalization. The selected EBVs express intra-specific trait variation and allow monitoring of how organisms respond to global change. We evaluate the societal relevance of species traits EBVs for policy targets and demonstrate how open, interoperable and machine-readable trait data enable the building of EBV data products. We outline collection methods, meta(data) standardization, reproducible workflows, semantic tools and licence requirements for producing species traits EBVs. An operationalization is critical for assessing progress towards biodiversity conservation and sustainable development goals and has wide implications for data-intensive science in ecology, biogeography, conservation and Earth observation.


Subject(s)
Biodiversity , Conservation of Natural Resources/methods , Invertebrates , Life History Traits , Plants , Vertebrates , Animals
19.
Sci Data ; 5: 170202, 2018 01 23.
Article in English | MEDLINE | ID: mdl-29360103

ABSTRACT

Harmonised, representative data on the state of biological invasions remain inadequate at country and global scales, particularly for taxa that affect biodiversity and ecosystems. Information is not readily available in a form suitable for policy and reporting. The Global Register of Introduced and Invasive Species (GRIIS) provides the first country-wise checklists of introduced (naturalised) and invasive species. GRIIS was conceived to provide a sustainable platform for information delivery to support national governments. We outline the rationale and methods underpinning GRIIS, to facilitate transparent, repeatable analysis and reporting. Twenty country checklists are presented as exemplars; GRIIS Checklists for close to all countries globally will be submitted through the same process shortly. Over 11000 species records are currently in the 20 country exemplars alone, with environmental impact evidence for just over 20% of these. GRIIS provides significant support for countries to identify and prioritise invasive alien species, and establishes national and global baselines. In future this will enable a global system for sustainable monitoring of trends in biological invasions that affect the environment.


Subject(s)
Ecosystem , Introduced Species , Animals , Biodiversity , Ecological Parameter Monitoring
20.
Biol Rev Camb Philos Soc ; 93(1): 600-625, 2018 02.
Article in English | MEDLINE | ID: mdl-28766908

ABSTRACT

Much biodiversity data is collected worldwide, but it remains challenging to assemble the scattered knowledge for assessing biodiversity status and trends. The concept of Essential Biodiversity Variables (EBVs) was introduced to structure biodiversity monitoring globally, and to harmonize and standardize biodiversity data from disparate sources to capture a minimum set of critical variables required to study, report and manage biodiversity change. Here, we assess the challenges of a 'Big Data' approach to building global EBV data products across taxa and spatiotemporal scales, focusing on species distribution and abundance. The majority of currently available data on species distributions derives from incidentally reported observations or from surveys where presence-only or presence-absence data are sampled repeatedly with standardized protocols. Most abundance data come from opportunistic population counts or from population time series using standardized protocols (e.g. repeated surveys of the same population from single or multiple sites). Enormous complexity exists in integrating these heterogeneous, multi-source data sets across space, time, taxa and different sampling methods. Integration of such data into global EBV data products requires correcting biases introduced by imperfect detection and varying sampling effort, dealing with different spatial resolution and extents, harmonizing measurement units from different data sources or sampling methods, applying statistical tools and models for spatial inter- or extrapolation, and quantifying sources of uncertainty and errors in data and models. To support the development of EBVs by the Group on Earth Observations Biodiversity Observation Network (GEO BON), we identify 11 key workflow steps that will operationalize the process of building EBV data products within and across research infrastructures worldwide. 
These workflow steps take multiple sequential activities into account, including identification and aggregation of various raw data sources, data quality control, taxonomic name matching and statistical modelling of integrated data. We illustrate these steps with concrete examples from existing citizen science and professional monitoring projects, including eBird, the Tropical Ecology Assessment and Monitoring network, the Living Planet Index and the Baltic Sea zooplankton monitoring. The identified workflow steps are applicable to both terrestrial and aquatic systems and a broad range of spatial, temporal and taxonomic scales. They depend on clear, findable and accessible metadata, and we provide an overview of current data and metadata standards. Several challenges remain to be solved for building global EBV data products: (i) developing tools and models for combining heterogeneous, multi-source data sets and filling data gaps in geographic, temporal and taxonomic coverage, (ii) integrating emerging methods and technologies for data collection such as citizen science, sensor networks, DNA-based techniques and satellite remote sensing, (iii) solving major technical issues related to data product structure, data storage, execution of workflows and the production process/cycle as well as approaching technical interoperability among research infrastructures, (iv) allowing semantic interoperability by developing and adopting standards and tools for capturing consistent data and metadata, and (v) ensuring legal interoperability by endorsing open data or data that are free from restrictions on use, modification and sharing. Addressing these challenges is critical for biodiversity research and for assessing progress towards conservation policy targets and sustainable development goals.


Subject(s)
Animal Distribution/physiology , Biodiversity , Environmental Monitoring/methods , Animals , Models, Biological