Search | VHL Regional Portal

Enhancing DNA barcode reference libraries by harvesting terrestrial arthropods at the Smithsonian's National Museum of Natural History.

Santos, Bernardo F; Miller, Meredith E; Miklasevskaja, Margarita; McKeown, Jaclyn T A; Redmond, Niamh E; Coddington, Jonathan A; Bird, Jessica; Miller, Scott E; Smith, Ashton; Brady, Seán G; Buffington, Matthew L; Chamorro, M Lourdes; Dikow, Torsten; Gates, Michael W; Goldstein, Paul; Konstantinov, Alexander; Kula, Robert; Silverson, Nicholas D; Solis, M Alma; deWaard, Stephanie L; Naik, Suresh; Nikolova, Nadya; Pentinsaari, Mikko; Prosser, Sean W J; Sones, Jayme E; Zakharov, Evgeny V; deWaard, Jeremy R.

Biodivers Data J ; 11: e100904, 2023.

Article in English | MEDLINE | ID: mdl-38327288

ABSTRACT

The use of DNA barcoding has revolutionised biodiversity science, but its application depends on the existence of comprehensive and reliable reference libraries. For many poorly known taxa, such reference sequences are missing even at higher-level taxonomic scales. We harvested the collections of the Smithsonian's National Museum of Natural History (USNM) to generate DNA barcoding sequences for genera of terrestrial arthropods previously not recorded in one or more major public sequence databases. Our workflow used a mix of Sanger and Next-Generation Sequencing (NGS) approaches to maximise sequence recovery while ensuring affordable cost. In total, COI sequences were obtained for 5,686 specimens belonging to 3,737 determined species in 3,886 genera and 205 families distributed in 137 countries. Success rates varied widely according to collection data and focal taxon. NGS helped recover sequences of specimens that failed a previous run of Sanger sequencing. Success rates and the optimal balance between Sanger and NGS are the most important drivers to maximise output and minimise cost in future projects. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, the Global Genome Biodiversity Network Data Portal and the NMNH data portal.

A workflow for expanding DNA barcode reference libraries through 'museum harvesting' of natural history collections.

Levesque-Beaudin, Valerie; Miller, Meredith E; Dikow, Torsten; Miller, Scott E; Prosser, Sean W J; Zakharov, Evgeny V; McKeown, Jaclyn T A; Sones, Jayme E; Redmond, Niamh E; Coddington, Jonathan A; Santos, Bernardo F; Bird, Jessica; deWaard, Jeremy R.

Biodivers Data J ; 11: e100677, 2023.

Article in English | MEDLINE | ID: mdl-38327333

ABSTRACT

Natural history collections are the physical repositories of our knowledge on species, the entities of biodiversity. Making this knowledge accessible to society - through, for example, digitisation or the construction of a validated, global DNA barcode library - is of crucial importance. To this end, we developed and streamlined a workflow for 'museum harvesting' of authoritatively identified Diptera specimens from the Smithsonian Institution's National Museum of Natural History. Our detailed workflow includes both on-site and off-site processing through specimen selection, labelling, imaging, tissue sampling, databasing and DNA barcoding. This approach was tested by harvesting and DNA barcoding 941 voucher specimens, representing 32 families, 819 genera and 695 identified species collected from 100 countries. We recovered 867 sequences (> 0 base pairs) with a sequencing success of 88.8% (727 of 819 sequenced genera gained a barcode > 300 base pairs). While Sanger-based methods were more effective for recently-collected specimens, the methods employing next-generation sequencing recovered barcodes for specimens over a century old. The utility of the newly-generated reference barcodes is demonstrated by the subsequent taxonomic assignment of nearly 5000 specimen records in the Barcode of Life Data Systems.
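
The percentages implied by the counts in this abstract can be checked with a few lines of Python. This is only an illustrative sketch: the helper `success_rate` and the variable names are hypothetical, and the specimen-level rate is derived here from the quoted counts (867 of 941) rather than stated in the abstract.

```python
def success_rate(successes: int, attempts: int) -> float:
    """Return successes as a percentage of attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * successes / attempts


# Counts quoted in the abstract above.
genera_sequenced = 819
genera_with_barcode = 727    # genera that gained a barcode > 300 bp
specimens_sampled = 941
sequences_recovered = 867    # sequences of any length (> 0 bp)

# Genus-level success, the figure the abstract reports as 88.8%.
genus_rate = success_rate(genera_with_barcode, genera_sequenced)

# Specimen-level recovery; derived here, not reported in the abstract.
specimen_rate = success_rate(sequences_recovered, specimens_sampled)

print(f"Genus-level success:     {genus_rate:.1f}%")
print(f"Specimen-level recovery: {specimen_rate:.1f}%")
```

The two rates use different denominators (genera versus specimens) and different length thresholds, so they are not directly comparable.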

A reference library for Canadian invertebrates with 1.5 million barcodes, voucher specimens, and DNA samples.

deWaard, Jeremy R; Ratnasingham, Sujeevan; Zakharov, Evgeny V; Borisenko, Alex V; Steinke, Dirk; Telfer, Angela C; Perez, Kate H J; Sones, Jayme E; Young, Monica R; Levesque-Beaudin, Valerie; Sobel, Crystal N; Abrahamyan, Arusyak; Bessonov, Kyrylo; Blagoev, Gergin; deWaard, Stephanie L; Ho, Chris; Ivanova, Natalia V; Layton, Kara K S; Lu, Liuqiong; Manjunath, Ramya; McKeown, Jaclyn T A; Milton, Megan A; Miskie, Renee; Monkhouse, Norm; Naik, Suresh; Nikolova, Nadya; Pentinsaari, Mikko; Prosser, Sean W J; Radulovici, Adriana E; Steinke, Claudia; Warne, Connor P; Hebert, Paul D N.

Sci Data ; 6(1): 308, 2019 Dec 06.

Article in English | MEDLINE | ID: mdl-31811161

ABSTRACT

The reliable taxonomic identification of organisms through DNA sequence data requires a well parameterized library of curated reference sequences. However, it is estimated that just 15% of described animal species are represented in public sequence repositories. To begin to address this deficiency, we provide DNA barcodes for 1,500,003 animal specimens collected from 23 terrestrial and aquatic ecozones at sites across Canada, a nation that comprises 7% of the planet's land surface. In total, 14 phyla, 43 classes, 163 orders, 1123 families, 6186 genera, and 64,264 Barcode Index Numbers (BINs; a proxy for species) are represented. Species-level taxonomy was available for 38% of the specimens, but higher proportions were assigned to a genus (69.5%) and a family (99.9%). Voucher specimens and DNA extracts are archived at the Centre for Biodiversity Genomics where they are available for further research. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, and the Global Genome Biodiversity Network Data Portal.

Subject(s)

DNA Barcoding, Taxonomic , Invertebrates/classification , Animals , Biodiversity , Canada

Metabarcoding a diverse arthropod mock community.

Braukmann, Thomas W A; Ivanova, Natalia V; Prosser, Sean W J; Elbrecht, Vasco; Steinke, Dirk; Ratnasingham, Sujeevan; de Waard, Jeremy R; Sones, Jayme E; Zakharov, Evgeny V; Hebert, Paul D N.

Mol Ecol Resour ; 19(3): 711-727, 2019 May.

Article in English | MEDLINE | ID: mdl-30779309

ABSTRACT

Although DNA metabarcoding is an attractive approach for monitoring biodiversity, it is often difficult to detect all the species present in a bulk sample. In particular, sequence recovery for a given species depends on its biomass and mitome copy number as well as the primer set employed for PCR. To examine these variables, we constructed a mock community of terrestrial arthropods comprised of 374 species. We used this community to examine how species recovery was impacted when amplicon pools were constructed in four ways. The first two protocols involved the construction of bulk DNA extracts from different body segments (Bulk Abdomen, Bulk Leg). The other protocols involved the production of DNA extracts from single legs which were then merged prior to PCR (Composite Leg) or PCR-amplified separately (Single Leg) and then pooled. The amplicons generated by these four treatments were then sequenced on three platforms (Illumina MiSeq, Ion Torrent PGM and Ion Torrent S5). The choice of sequencing platform did not substantially influence species recovery, although the MiSeq delivered the highest sequence quality. As expected, species recovery was most efficient from the Single Leg treatment because amplicon abundance varied little among taxa. Among the three treatments where PCR occurred after pooling, the Bulk Abdomen treatment produced a more uniform read abundance than the Bulk Leg or Composite Leg treatment. Primer choice also influenced species recovery and evenness. Our results reveal how variation in protocols can have substantial impacts on perceived diversity unless sequencing coverage is sufficient to reach an asymptote.

Subject(s)

Arthropods/classification , Arthropods/genetics , DNA Barcoding, Taxonomic/methods , DNA/isolation & purification , Metagenome , Animals , DNA/chemistry , DNA/genetics , Models, Theoretical , Sequence Analysis, DNA

Expedited assessment of terrestrial arthropod diversity by coupling Malaise traps with DNA barcoding.

deWaard, Jeremy R; Levesque-Beaudin, Valerie; deWaard, Stephanie L; Ivanova, Natalia V; McKeown, Jaclyn T A; Miskie, Renee; Naik, Suresh; Perez, Kate H J; Ratnasingham, Sujeevan; Sobel, Crystal N; Sones, Jayme E; Steinke, Claudia; Telfer, Angela C; Young, Andrew D; Young, Monica R; Zakharov, Evgeny V; Hebert, Paul D N.

Genome ; 62(3): 85-95, 2019 Mar.

Article in English | MEDLINE | ID: mdl-30257096

ABSTRACT

Monitoring changes in terrestrial arthropod communities over space and time requires a dramatic increase in the speed and accuracy of processing samples that cannot be achieved with morphological approaches. The combination of DNA barcoding and Malaise traps allows expedited, comprehensive inventories of species abundance whose cost will rapidly decline as high-throughput sequencing technologies advance. Aside from detailing protocols from specimen sorting to data release, this paper describes their use in a survey of arthropod diversity in a national park that examined 21 194 specimens representing 2255 species. These protocols can support arthropod monitoring programs at regional, national, and continental scales.

Subject(s)

Arthropods/classification , Arthropods/genetics , Biodiversity , DNA Barcoding, Taxonomic/methods , DNA/genetics , Entomology/instrumentation , Animals , DNA/analysis , Phylogeny , Species Specificity

A Sequel to Sanger: amplicon sequencing that scales.

Hebert, Paul D N; Braukmann, Thomas W A; Prosser, Sean W J; Ratnasingham, Sujeevan; deWaard, Jeremy R; Ivanova, Natalia V; Janzen, Daniel H; Hallwachs, Winnie; Naik, Suresh; Sones, Jayme E; Zakharov, Evgeny V.

BMC Genomics ; 19(1): 219, 2018 Mar 27.

Article in English | MEDLINE | ID: mdl-29580219

ABSTRACT

BACKGROUND: Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. RESULTS: By examining templates from more than 5000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first (Sanger) and second generation platforms (Illumina, Ion). CONCLUSIONS: SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first or second generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year.

Subject(s)

Arthropods/genetics , High-Throughput Nucleotide Sequencing/methods , Polymerase Chain Reaction/methods , Sequence Analysis, DNA/methods , Animals , Arthropods/classification , Genetic Variation

Biodiversity inventories in high gear: DNA barcoding facilitates a rapid biotic survey of a temperate nature reserve.

Telfer, Angela C; Young, Monica R; Quinn, Jenna; Perez, Kate; Sobel, Crystal N; Sones, Jayme E; Levesque-Beaudin, Valerie; Derbyshire, Rachael; Fernandez-Triana, Jose; Rougerie, Rodolphe; Thevanayagam, Abinah; Boskovic, Adrian; Borisenko, Alex V; Cadel, Alex; Brown, Allison; Pages, Anais; Castillo, Anibal H; Nicolai, Annegret; Glenn Mockford, Barb Mockford; Bukowski, Belén; Wilson, Bill; Trojahn, Brock; Lacroix, Carole Ann; Brimblecombe, Chris; Hay, Christoper; Ho, Christmas; Steinke, Claudia; Warne, Connor P; Garrido Cortes, Cristina; Engelking, Daniel; Wright, Danielle; Lijtmaer, Dario A; Gascoigne, David; Hernandez Martich, David; Morningstar, Derek; Neumann, Dirk; Steinke, Dirk; Marco DeBruin, Donna DeBruin; Dobias, Dylan; Sears, Elizabeth; Richard, Ellen; Damstra, Emily; Zakharov, Evgeny V; Laberge, Frederic; Collins, Gemma E; Blagoev, Gergin A; Grainge, Gerrie; Ansell, Graham; Meredith, Greg; Hogg, Ian.

Biodivers Data J ; (3): e6313, 2015.

Article in English | MEDLINE | ID: mdl-26379469

ABSTRACT

BACKGROUND: Comprehensive biotic surveys, or 'all taxon biodiversity inventories' (ATBI), have traditionally been limited in scale or scope due to the complications surrounding specimen sorting and species identification. To circumvent these issues, several ATBI projects have successfully integrated DNA barcoding into their identification procedures and witnessed acceleration in their surveys and subsequent increase in project scope and scale. The Biodiversity Institute of Ontario partnered with the rare Charitable Research Reserve and delegates of the 6th International Barcode of Life Conference to complete its own rapid, barcode-assisted ATBI of an established land trust in Cambridge, Ontario, Canada. NEW INFORMATION: The existing species inventory for the rare Charitable Research Reserve was rapidly expanded by integrating a DNA barcoding workflow with two surveying strategies - a comprehensive sampling scheme over four months, followed by a one-day bioblitz involving international taxonomic experts. The two surveys resulted in 25,287 and 3,502 specimens barcoded, respectively, as well as 127 human observations. This barcoded material, all vouchered at the Biodiversity Institute of Ontario collection, covers 14 phyla, 29 classes, 117 orders, and 531 families of animals, plants, fungi, and lichens. Overall, the ATBI documented 1,102 new species records for the nature reserve, expanding the existing long-term inventory by 49%. In addition, 2,793 distinct Barcode Index Numbers (BINs) were assigned to genus or higher level taxonomy, and represent additional species that will be added once their taxonomy is resolved. For the 3,502 specimens, the collection, sequence analysis, taxonomic assignment, data release and manuscript submission by 100+ co-authors all occurred in less than one week. 
This demonstrates the speed at which barcode-assisted inventories can be completed and the utility that barcoding provides in minimizing and guiding valuable taxonomic specialist time. The final product is more than a comprehensive biotic inventory - it is also a rich dataset of fine-scale occurrence and sequence data, all archived and cross-linked in the major biodiversity data repositories. This model of rapid generation and dissemination of essential biodiversity data could be followed to conduct regional assessments of biodiversity status and change, and potentially be employed for evaluating progress towards the Aichi Targets of the Strategic Plan for Biodiversity 2011-2020.

A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.

Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John.

PLoS One ; 8(7): e68535, 2013.

Article in English | MEDLINE | ID: mdl-23874660

ABSTRACT

DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publicly accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.

Subject(s)

Biological Specimen Banks , DNA Barcoding, Taxonomic/methods , Insecta/classification , Libraries, Digital , Natural History/methods , Sequence Analysis, DNA/methods , Animals , Australia , Feasibility Studies , Information Storage and Retrieval/methods , Insecta/genetics , Quality Control , Sequence Analysis, DNA/standards , Specimen Handling/methods , Time Factors

The front-end logistics of DNA barcoding: challenges and prospects.

Borisenko, Alex V; Sones, Jayme E; Hebert, Paul D N.

Mol Ecol Resour ; 9 Suppl s1: 27-34, 2009 May.

Article in English | MEDLINE | ID: mdl-21564961

ABSTRACT

Building a global library of DNA barcodes will require efficient logistics of pre-laboratory specimen processing and seamless interfacing with molecular protocols. If not addressed properly, the task of aggregating specimens may become the biggest bottleneck in the analytical chain. Three years of experience in developing a collection management system to facilitate high-throughput DNA barcoding have allowed the Canadian Centre for DNA Barcoding to recognize and resolve the most common logistical obstacles. Dealing with these challenges on a larger scale will be an important step towards building a solid collection-based foundation for the international DNA barcoding effort.
