Results 1 - 4 of 4
1.
J Anim Ecol ; 93(2): 147-158, 2024 02.
Article in English | MEDLINE | ID: mdl-38230868

ABSTRACT

Classifying specimens is a critical component of ecological research, biodiversity monitoring and conservation. However, manual classification can be prohibitively time-consuming and expensive, limiting how much data a project can afford to process. Computer vision, a form of machine learning, can help overcome these problems by rapidly, automatically and accurately classifying images of specimens. Given the diversity of animal species and contexts in which images are captured, there is no universal classifier for all species and use cases. As such, ecologists often need to train their own models. While numerous software programs exist to support this process, ecologists need a fundamental understanding of how computer vision works to select appropriate model workflows based on their specific use case, data types, computing resources and desired performance capabilities. Ecologists may also face characteristic quirks of ecological datasets, such as long-tail distributions, 'unknown' species, similarity between species and polymorphism within species, which impact the efficacy of computer vision. Despite growing interest in computer vision for ecology, there are few resources available to help ecologists face the challenges they are likely to encounter. Here, we present a gentle introduction for species classification using computer vision. In this manuscript and associated GitHub repository, we demonstrate how to prepare training data, basic model training procedures, and methods for model evaluation and selection. Throughout, we explore specific considerations ecologists should make when training classification models, such as data domains, feature extractors and class imbalances. With these basics, ecologists can adjust their workflows to achieve research goals and/or account for uncertainty in downstream analysis. Our goal is to provide guidance for ecologists for getting started in or improving their use of machine learning for visual classification tasks.


Subject(s)
Computers , Neural Networks, Computer , Animals , Machine Learning , Biodiversity
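The class imbalances and long-tail distributions mentioned in the abstract above are often handled by weighting rare classes more heavily during training. The following is a minimal sketch of one common heuristic, inverse-frequency ("balanced") class weights. It is not taken from the article or its GitHub repository; the function name and the species labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, where weight = n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so each class contributes
    roughly equally to a weighted loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical long-tailed label list: one common species and two rare ones.
labels = ["species_a"] * 80 + ["species_b"] * 15 + ["species_c"] * 5
weights = inverse_frequency_weights(labels)

for species, weight in sorted(weights.items()):
    print(f"{species}: {weight:.3f}")
```

Weights of this kind can be passed to a weighted loss function in most training frameworks. Whether weighting, resampling or another strategy works best depends on the dataset.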
2.
PLoS One ; 18(7): e0288415, 2023.
Article in English | MEDLINE | ID: mdl-37440520

ABSTRACT

Allochronic speciation, where reproductive isolation between populations of a species is facilitated by a difference in reproductive timing, depends on abiotic factors such as seasonality and biotic factors such as diapause intensity. These factors are strongly influenced by latitudinal trends in climate, so we hypothesized that there is a relationship between latitude and divergence among populations separated by life history timing. Hyphantria cunea (the fall webworm), a lepidopteran defoliator with red and black colour morphs, is hypothesized to be experiencing incipient allochronic speciation. However, given its broad geographic range, the strength of allochronic speciation may vary across latitude. We annotated >11,000 crowd-sourced observations of fall webworm to model geographic distribution, phenology, and differences in colour phenotype between morphs across North America. We found that red and black morph life history timing differs across North America, and the phenology of morphs diverges more in warmer climates at lower latitudes. We also found some evidence that the colour phenotype of morphs diverges at lower latitudes, suggesting reduced gene flow between colour morphs. Our results demonstrate that seasonality at lower latitudes may increase the strength of allochronic speciation in insects, and that the strength of sympatric speciation can vary along a latitudinal gradient. This has implications for our understanding of broad-scale speciation events and trends in global biodiversity.


Subject(s)
Crowdsourcing , Moths , Animals , Moths/genetics , Climate , Biodiversity , North America , Genetic Speciation
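One simple way to picture the comparison described in the abstract above is to measure, within latitude bands, how far apart the two morphs are in observation timing. The sketch below is only an illustration of that idea and is not the authors' model. The function name, the 10-degree band width, and every observation in the toy list are invented for the example and are not data from the study.

```python
from collections import defaultdict
from statistics import median


def phenology_gap_by_band(observations, band_width=10):
    """Return {band_start: |median day-of-year (red) - median day-of-year (black)|}.

    observations: iterable of (latitude, morph, day_of_year) tuples,
    where morph is "red" or "black". Bands lacking either morph are skipped.
    """
    days = defaultdict(lambda: defaultdict(list))
    for latitude, morph, day_of_year in observations:
        band = int(latitude // band_width) * band_width
        days[band][morph].append(day_of_year)

    gaps = {}
    for band, by_morph in days.items():
        if "red" in by_morph and "black" in by_morph:
            gaps[band] = abs(median(by_morph["red"]) - median(by_morph["black"]))
    return gaps


# Invented toy observations (latitude, morph, day of year); not real records.
toy_observations = [
    (32.0, "red", 150), (33.5, "red", 160),
    (31.0, "black", 200), (34.0, "black", 210),
    (44.0, "red", 190), (46.0, "red", 194),
    (45.0, "black", 200), (43.0, "black", 204),
]
gaps = phenology_gap_by_band(toy_observations)
print(gaps)
```

A larger gap in a band means the two morphs were observed further apart in the season at that latitude. A real analysis would also need to account for uneven observer effort in crowd-sourced records.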
3.
Ecol Evol ; 10(23): 13143-13153, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33304524

ABSTRACT

Insect populations are changing rapidly, and monitoring these changes is essential for understanding the causes and consequences of such shifts. However, large-scale insect identification projects are time-consuming and expensive when done solely by human identifiers. Machine learning offers a possible solution to help collect insect data quickly and efficiently. Here, we outline a methodology for training classification models to identify pitfall trap-collected insects from image data and then apply the method to identify ground beetles (Carabidae). All beetles were collected by the National Ecological Observatory Network (NEON), a continental scale ecological monitoring project with sites across the United States. We describe the procedures for image collection, image data extraction, data preparation, and model training, and compare the performance of five machine learning algorithms and two classification methods (hierarchical vs. single-level) identifying ground beetles from the species to subfamily level. All models were trained using pre-extracted feature vectors, not raw image data. Our methodology allows for data to be extracted from multiple individuals within the same image thus enhancing time efficiency, utilizes relatively simple models that allow for direct assessment of model performance, and can be performed on relatively small datasets. The best performing algorithm, linear discriminant analysis (LDA), reached an accuracy of 84.6% at the species level when naively identifying species, which was further increased to >95% when classifications were limited by known local species pools. Model performance was negatively correlated with taxonomic specificity, with the LDA model reaching an accuracy of ~99% at the subfamily level. When classifying carabid species not included in the training dataset at higher taxonomic levels, the models performed significantly better than if classifications were made randomly. We also observed greater performance when classifications were made using the hierarchical classification method compared to the single-level classification method at higher taxonomic levels. The general methodology outlined here serves as a proof-of-concept for classifying pitfall trap-collected organisms using machine learning algorithms, and the image data extraction methodology may be used for non-machine-learning uses. We propose that integration of machine learning in large-scale identification pipelines will increase efficiency and lead to a greater flow of insect macroecological data, with the potential to be expanded for use with other noninsect taxa.
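The idea of limiting classifications by known local species pools, mentioned in the abstract above, can be sketched as a post-processing step on a model's class probabilities. This is a hedged illustration, not the authors' implementation: the function name, the species labels and the probabilities are all hypothetical.

```python
def classify_with_species_pool(probabilities, local_pool):
    """Pick the most probable species among those known to occur locally.

    probabilities: dict mapping species name -> model probability.
    local_pool: set of species names known from the site.
    Returns (species, renormalized_probability).
    """
    allowed = {sp: p for sp, p in probabilities.items() if sp in local_pool}
    if not allowed:
        raise ValueError("No predicted class is in the local species pool.")

    total = sum(allowed.values())
    if total == 0:
        raise ValueError("All classes in the local species pool have zero probability.")

    # Renormalize so the remaining probabilities sum to 1.
    renormalized = {sp: p / total for sp, p in allowed.items()}
    best = max(renormalized, key=renormalized.get)
    return best, renormalized[best]


# Hypothetical model output for one specimen and a hypothetical site list.
probabilities = {"species_a": 0.5, "species_b": 0.3, "species_c": 0.2}
local_pool = {"species_b", "species_c"}

best_species, confidence = classify_with_species_pool(probabilities, local_pool)
print(best_species, round(confidence, 2))
```

Without the pool restriction, the top prediction here would be species_a. Masking it out changes the answer to the most probable species that is actually recorded at the site.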

4.
Biodivers Data J ; 8: e32765, 2020.
Article in English | MEDLINE | ID: mdl-32269475

ABSTRACT

Biodiversity informatics depends on digital access to credible information about species. Many online resources host species' data, but the lack of categorisation for these resources inhibits the growth of this entire field. To explore possible solutions, we examined the (now retired) Biodiversity Information Projects of the World (BIPW) dataset created by the Biodiversity Information Standards (TDWG); this project, which ran from 2007 to 2015 (and was officially removed from the TDWG website in 2018), was an attempt at organising the Web's biodiversity databases into an indexed list. To do this, we applied a simple classification scheme to score databases within BIPW based on nine data categories, to characterise trends and current compositions of this biodiversity e-infrastructure. Primarily, we found that of 600 databases investigated from BIPW, only 315 (~53%) were accessible at the time of writing, underscoring the precarious nature of the biodiversity information landscape. Many of these databases are still available, but suffer accessibility issues such as link rot, thus putting the information they contain in danger of being lost. We propose that a community-driven database of biodiversity databases with an accompanying ontology could facilitate efficient discovery of relevant biodiversity databases and support smaller databases - which have the greatest risk of being lost.
