Results 1 - 6 of 6
1.
Gigascience ; 11, 2022 08 11.
Article in English | MEDLINE | ID: mdl-35950838

ABSTRACT

Metagenomics is a culture-independent method for studying the microbes inhabiting a particular environment. Comparing the composition of samples (functionally/taxonomically), either from a longitudinal study or cross-sectional studies, can provide clues as to how the microbiota has adapted to the environment. However, a recurring challenge, especially when comparing results between independent studies, is that key metadata about the sample and molecular methods used to extract and sequence the genetic material are often missing from sequence records, making it difficult to account for confounding factors. Nevertheless, these missing metadata may be found in the narrative of publications describing the research. Here, we describe a machine learning framework that automatically extracts essential metadata for a wide range of metagenomics studies from the literature contained in Europe PMC. This framework has enabled the extraction of metadata from 114,099 publications in Europe PMC, including 19,900 publications describing metagenomics studies in the European Nucleotide Archive (ENA) and MGnify. Using this framework, a new metagenomics annotations pipeline was developed and integrated into Europe PMC to regularly enrich up-to-date ENA and MGnify metagenomics studies with metadata extracted from research articles. These metadata are now available for researchers to explore and retrieve on the MGnify and Europe PMC websites, as well as through the Europe PMC annotations API.


Subject(s)
Metadata , Metagenomics , Access to Information , Cross-Sectional Studies , Longitudinal Studies , Machine Learning , Metagenomics/methods
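
The abstract of this record mentions retrieving extracted metadata through the Europe PMC annotations API. The sketch below only builds a request URL and sends nothing over the network. The base URL, the `articleIds`, `type` and `format` parameter names, and the `MED:` source prefix for the MEDLINE ID are assumptions not stated in the record and should be checked against the official API documentation. `build_annotations_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Europe PMC Annotations API (not given in the record).
ANNOTATIONS_BASE = (
    "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"
)


def build_annotations_url(article_ids, annotation_type=None, fmt="JSON"):
    """Build a request URL for annotations on one or more articles.

    article_ids: iterable of "SOURCE:ID" strings, e.g. "MED:35950838"
                 (the "MED" prefix for MEDLINE records is an assumption).
    annotation_type: optional filter; the parameter name "type" is assumed.
    """
    params = [("articleIds", ",".join(article_ids))]
    if annotation_type:
        params.append(("type", annotation_type))
    params.append(("format", fmt))
    # urlencode percent-encodes reserved characters such as ":" and ",".
    return f"{ANNOTATIONS_BASE}?{urlencode(params)}"


# Record 1 in this listing has MEDLINE ID 35950838.
url = build_annotations_url(["MED:35950838"])
print(url)
```

The resulting URL could then be fetched with any HTTP client; the shape of the response is not described in the abstract, so it is left out here.
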
2.
Nucleic Acids Res ; 49(D1): D1507-D1514, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33180112

ABSTRACT

Europe PMC (https://europepmc.org) is a database of research articles, including peer reviewed full text articles and abstracts, and preprints - all freely available for use via website, APIs and bulk download. This article outlines new developments since 2017 where work has focussed on three key areas: (i) Europe PMC has added to its core content to include life science preprint abstracts and a special collection of full text of COVID-19-related preprints. Europe PMC is unique as an aggregator of biomedical preprints alongside peer-reviewed articles, with over 180 000 preprints available to search. (ii) Europe PMC has significantly expanded its links to content related to the publications, such as links to Unpaywall, providing wider access to full text, preprint peer-review platforms, all major curated data resources in the life sciences, and experimental protocols. The redesigned Europe PMC website features the PubMed abstract and corresponding PMC full text merged into one article page; there is more evident and user-friendly navigation within articles and to related content, plus a figure browse feature. (iii) The expanded annotations platform offers ∼1.3 billion text mined biological terms and concepts sourced from 10 providers and over 40 global data resources.


Subject(s)
Biological Science Disciplines/statistics & numerical data , COVID-19/prevention & control , Data Curation/statistics & numerical data , Data Mining/statistics & numerical data , Databases, Factual/statistics & numerical data , PubMed , SARS-CoV-2/isolation & purification , Biological Science Disciplines/methods , Biomedical Research/methods , Biomedical Research/statistics & numerical data , COVID-19/epidemiology , COVID-19/virology , Data Curation/methods , Data Mining/methods , Epidemics , Europe , Humans , Internet , SARS-CoV-2/physiology
3.
Acta Crystallogr F Struct Biol Commun ; 75(Pt 11): 665-672, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31702580

ABSTRACT

This work presents an annotation tool that automatically locates mentions of particular amino-acid residues in published papers and identifies the protein concerned. These matches can be provided in context or in a searchable format in order for researchers to better use the existing and future literature.


Subject(s)
Molecular Sequence Annotation , Proteins/chemistry , Publications , Amino Acids/chemistry , Automation , Mutation/genetics , Proteins/genetics , Software
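
To illustrate the first step this abstract describes, locating mentions of amino-acid residues in text, here is a minimal sketch using a regular expression. It is not the tool's actual method, which the abstract does not specify. The pattern covers only three-letter residue codes followed by a position number, and it does not attempt the second step of identifying the protein concerned. `find_residue_mentions` is a hypothetical name.

```python
import re

# Standard three-letter codes for the 20 common amino acids.
THREE_LETTER = (
    "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
    "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
)

# Matches forms such as "Ser195", "His-57" and "Asp 102".
RESIDUE_PATTERN = re.compile(rf"\b({THREE_LETTER})[- ]?(\d+)\b")


def find_residue_mentions(text):
    """Return (residue code, position) pairs in order of appearance."""
    return [(m.group(1), int(m.group(2))) for m in RESIDUE_PATTERN.finditer(text)]


sample = "The catalytic triad comprises Ser195, His-57 and Asp 102."
mentions = find_residue_mentions(sample)
print(mentions)
```

A real annotator would also need to handle one-letter codes, full residue names and mutation notation, all of which are far more ambiguous than the three-letter form used here.
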
4.
Nucleic Acids Res ; 46(D1): D1254-D1260, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29161421

ABSTRACT

Europe PMC (https://europepmc.org) is a comprehensive resource of biomedical research publications that offers advanced tools for search, retrieval, and interaction with the scientific literature. This article outlines new developments since 2014. In addition to delivering the core database and services, Europe PMC focuses on three areas of development: individual user services, data integration, and infrastructure to support text and data mining. Europe PMC now provides user accounts to save search queries and claim publications to ORCIDs, as well as open access profiles for authors based on public ORCID records. We continue to foster connections between scientific data and literature in a number of ways. All the data behind the paper - whether in structured archives, generic archives or as supplemental files - are now available via links to the BioStudies database. Text-mined biological concepts, including database accession numbers and data DOIs, are highlighted in the text and linked to the appropriate data resources. The SciLite community annotation platform accepts text-mining results from various contributors and overlays them on research articles as licence allows. In addition, text miners and developers can access all open content via APIs or via the FTP site.


Subject(s)
Biomedical Research , Databases, Bibliographic , Data Mining , Internet , Serial Publications , User-Computer Interface
5.
Wellcome Open Res ; 1: 25, 2016.
Article in English | MEDLINE | ID: mdl-28948232

ABSTRACT

The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating facts described in those papers. In particular, biological databases depend on curators to add highly precise and useful information that is usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve linking literature to the underlying data, thereby minimising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to aid researchers and curators using Europe PMC in finding key concepts more easily and provide links to related resources or tools, bridging the gap between literature and biological data.

6.
J Biomed Semantics ; 6: 7, 2015.
Article in English | MEDLINE | ID: mdl-25774284

ABSTRACT

BACKGROUND: As the availability of open access full text research articles increases, so does the need for sophisticated search services that make the most of this new content. Here, we present a new feature available in Europe PMC that allows selected sections of full text articles to be searched, including figures and reference lists. Users can now search particular parts of an article, reducing noise and allowing fine-tuning of searches. RESULTS: To the best of our knowledge, Europe PMC is the first service that provides a granular literature search by allowing users to target their search to particular sections of articles. This new functionality is based on a heuristic algorithm that identifies and categorises article sections into 17 pre-defined categories based on the section heading. The tagger's performance is measured against a manually curated dataset consisting of 100 full text articles with an F-score of 98.02%. CONCLUSIONS: The section search is available from the advanced search within Europe PMC (http://europepmc.org). The source code is freely available from http://europepmc.org/ftp/oa/SectionTagger/.
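
The abstract above describes a heuristic that assigns article sections to pre-defined categories based on the section heading, evaluated by F-score. The following is a minimal sketch of that general idea, not the published SectionTagger. The category labels and patterns are hypothetical and cover only a few cases, since the abstract does not list the 17 categories. `tag_heading` and `f_score` are illustrative names.

```python
import re

# Hypothetical subset of categories; the 17 used by the real tagger
# are not listed in the abstract.
RULES = [
    ("INTRO", re.compile(r"\b(introduction|background)\b", re.I)),
    ("METHODS", re.compile(r"\b(methods?|materials|experimental procedures?)\b", re.I)),
    ("RESULTS", re.compile(r"\b(results?|findings)\b", re.I)),
    ("DISCUSS", re.compile(r"\bdiscussion\b", re.I)),
    ("CONCL", re.compile(r"\bconclusions?\b", re.I)),
    ("REF", re.compile(r"\b(references|bibliography)\b", re.I)),
]


def tag_heading(heading):
    """Return every category whose pattern matches, or ["OTHER"] if none do."""
    # Drop leading section numbering such as "2." or "3.1 ".
    cleaned = re.sub(r"^[\d.\s]+", "", heading)
    return [label for label, pattern in RULES if pattern.search(cleaned)] or ["OTHER"]


def f_score(tp, fp, fn):
    """Balanced F-score (F1) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(tag_heading("2. Materials and Methods"))
print(tag_heading("Results and Discussion"))
```

Returning a list rather than a single label lets a combined heading such as "Results and Discussion" fall into both categories. Whether the published tagger does this is not stated in the abstract.
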
