Results 1 - 14 of 14
1.
Database (Oxford) ; 2023, 2023 03 31.
Article in English | MEDLINE | ID: mdl-37002680

ABSTRACT

The curation of genomic variants requires collecting evidence not only in variant knowledge bases but also in the literature. However, some variants result in no match when searched in the scientific literature. Indeed, it has been reported that a significant subset of information related to genomic variants is not reported in the full text, but only in the supplementary materials associated with a publication. In this study, we present an evaluation of the use of supplementary data (SD) to improve the retrieval of relevant scientific publications for variant curation. Our experiments show that searching SD significantly increases the volume of documents retrieved for a variant, thus reducing by ∼63% the number of variants for which no match is found in the scientific literature. SD thus represent a paramount source of information for curating variants of unknown significance and should receive more attention from global research infrastructures that maintain literature search engines. Database URL: https://www.expasy.org/resources/variomes.


Subject(s)
Genomics , Search Engine , Databases, Factual
2.
Stud Health Technol Inform ; 294: 839-843, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612222

ABSTRACT

The importance of genomic data for health is rapidly growing, but accessing and gathering information about variants from different sources is hindered by highly heterogeneous representations of variants, as outlined by clinical associations (AMP/ASCO/CAP) in their recommendations. To enable a smooth and effective retrieval of variant-containing documents from different resources, we developed a tool (https://goldorak.hesge.ch/synvar/) that generates for any given SNP, including variants not present in existing databases, its corresponding description at the genome, transcript and protein levels. It provides variant descriptions in the HGVS format as well as in many non-standard formats found in the literature, along with database identifiers. We present the SynVar service and evaluate its impact on the recall of a genomic variant curation-support service. Using SynVar to search variants in the literature increases recall by 133.8% without a strong impact on precision (i.e. 93%).


Subject(s)
Genomics , Databases, Factual
3.
Bioinformatics ; 38(9): 2595-2601, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35274687

ABSTRACT

MOTIVATION: Identification and interpretation of clinically actionable variants is a critical bottleneck. Searching for evidence in the literature is mandatory according to ASCO/AMP/CAP practice guidelines; however, it is both labor-intensive and error-prone. We developed a system to perform triage of publications relevant to support an evidence-based decision. The system is also able to prioritize variants. Our system searches within pre-annotated collections such as MEDLINE and PubMed Central. RESULTS: We assess the search effectiveness of the system using three different experimental settings: literature triage, variant prioritization and comparison of Variomes with LitVar. Almost two-thirds of the publications returned in the top-5 are relevant for clinical decision-support. Our approach enabled identifying 81.8% of clinically actionable variants in the top-3. Variomes retrieves on average 21.3% more articles than LitVar and returns the same number of results or more results than LitVar for 90% of the queries when tested on a set of 803 queries, thus establishing a new baseline for searching the literature about variants. AVAILABILITY AND IMPLEMENTATION: Variomes is publicly available at https://candy.hesge.ch/Variomes. Source code is freely available at https://github.com/variomes/sibtm-variomes. SynVar is publicly available at https://goldorak.hesge.ch/synvar. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Search Engine , Genomics/methods , Genome , PubMed , Software
4.
Stud Health Technol Inform ; 270: 884-888, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32570509

ABSTRACT

The Swiss Variant Interpretation Platform for Oncology is a centralized, joint and curated database for clinical somatic variants piloted by a board of Swiss healthcare institutions and operated by the SIB Swiss Institute of Bioinformatics. To support this effort, SIB Text Mining designed a set of text analytics services. This report focuses on three of those services. First, the literature has been automatically annotated with a set of terminologies, resulting in a large annotated version of MEDLINE and PMC. Second, a generator of variant synonyms for single nucleotide variants has been developed using publicly available data resources, as well as patterns of non-standard formats often found in the literature. Third, a literature ranking service retrieves a ranked set of MEDLINE abstracts given a variant and optionally a diagnosis. The annotation of MEDLINE and PMC resulted in a total of 785,181,199 and 1,156,060,212 annotations, respectively, which means an average of 26 annotations per abstract and 425 per full-text article. The generator of variant synonyms retrieves up to 42 synonyms for a variant. The literature ranking service reaches a precision (P10) of 63%, which means that almost two-thirds of the top-10 returned abstracts are judged relevant. Further services will be implemented to complete this set of services, such as a service to retrieve relevant clinical trials for a patient and a literature ranking service for full-text articles.


Subject(s)
Computational Biology , Data Mining , Abstracting and Indexing , Humans , MEDLINE , Switzerland
5.
Nucleic Acids Res ; 48(W1): W12-W16, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32379317

ABSTRACT

Thanks to recent efforts by the text mining community, biocurators now have access to plenty of good tools and Web interfaces for identifying and visualizing biomedical entities in literature. Yet, many of these systems start with a PubMed query, which is limited by strong Boolean constraints. Some semantic search engines exploit entities for Information Retrieval, and/or deliver relevance-based ranked results. Yet, they are not designed for supporting a specific curation workflow, and allow very limited control over the search process. The Swiss Institute of Bioinformatics Literature Services (SIBiLS) provide personalized Information Retrieval in the biological literature. Indeed, SIBiLS allow fully customizable search in semantically enriched contents, based on keywords and/or mapped biomedical entities from a growing set of standardized and legacy vocabularies. The services have been used and favourably evaluated to assist the curation of genes and gene products, by delivering customized literature triage engines to different curation teams. SIBiLS (https://candy.hesge.ch/SIBiLS) are freely accessible via REST APIs and are ready to empower any curation workflow, built on modern technologies scalable with big data: MongoDB and Elasticsearch. They cover MEDLINE and PubMed Central Open Access enriched by nearly 2 billion mapped biomedical entities, and are updated daily.


Subject(s)
Data Mining/methods , Search Engine , MEDLINE , Precision Medicine
6.
Bioinformatics ; 36(10): 3244-3245, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31985787

ABSTRACT

SUMMARY: The Feature-Viewer is a lightweight library for the visualization of biological data mapped to a protein or nucleotide sequence. It is designed for ease of use while allowing for full customization. The library is already used by several biological data resources and allows intuitive visual mapping of a full spectrum of sequence features for different usages. AVAILABILITY AND IMPLEMENTATION: The Feature-Viewer is open source, compatible with state-of-the-art development technologies and responsive, also for mobile viewing. Documentation and usage examples are available online.


Subject(s)
Computers , Software
7.
Nucleic Acids Res ; 48(D1): D328-D334, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31724716

ABSTRACT

The neXtProt knowledgebase (https://www.nextprot.org) is an integrative resource providing both data on human proteins and the tools to explore them. In order to provide comprehensive and up-to-date data, we evaluate and add new data sets. We describe the incorporation of three new data sets that provide expression, function, protein-protein binary interaction, post-translational modification (PTM) and variant information. New SPARQL query examples illustrating uses of the new data were added. neXtProt has continued to develop tools for proteomics. We have improved the peptide uniqueness checker and have implemented a new protein digestion tool. Together, these tools make it possible to determine which proteases can be used to identify trypsin-resistant proteins by mass spectrometry. In terms of usability, we have finished revamping our web interface and completely rewritten our API. Our SPARQL endpoint now supports federated queries. All the neXtProt data are available via our user interface, API, SPARQL endpoint and FTP site, including the new PEFF 1.0 format files. Finally, the data on our FTP site are now licensed under CC BY 4.0 to promote their reuse.


Subject(s)
Databases, Protein , Knowledge Bases , Humans , Internet , Mass Spectrometry , Peptides/chemistry , Protein Kinases/chemistry , Protein Kinases/metabolism , Protein Processing, Post-Translational , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Sequence Analysis, RNA , Software , Trypsin , User-Computer Interface
8.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576492

ABSTRACT

The development of efficient text-mining tools promises to boost the curation workflow by significantly reducing the time needed to process the literature into biological databases. We have developed a curation support tool, neXtA5, that provides a search engine coupled with an annotation system directly integrated into a biocuration workflow. neXtA5 assists curation with modules optimized for the various curation tasks: document triage, entity recognition and information extraction. Here, we describe the evaluation of neXtA5 by expert curators. We first assessed the annotations of two independent curators to provide a baseline for comparison. To evaluate the performance of neXtA5, we submitted requests and compared the neXtA5 results with the manual curation. The analysis focuses on the usability of neXtA5 to support the curation of two types of data: biological processes (BPs) and diseases (Ds). We evaluated the relevance of the papers proposed as well as the recall and precision of the suggested annotations. The evaluation of the precision of document triage by neXtA5 showed that both curators agree with neXtA5 for 67 (BP) and 63% (D) of abstracts, while curators agree on accepting or rejecting an abstract ~80% of the time. Hence, the precision of the triage system is satisfactory. For concept extraction, curators approved 35 (BP) and 25% (D) of the neXtA5 annotations. Conversely, neXtA5 successfully annotated up to 36 (BP) and 68% (D) of the terms identified by curators. The user feedback obtained in these tests highlighted the need for improvement in the ranking function of neXtA5 annotations. Therefore, we transformed the information extraction component into an annotation ranking system. This improvement results in a top precision (precision at first rank) of 59 (D) and 63% (BP). These results suggest that when considering only the first extracted entity, the current system achieves a precision comparable with expert biocurators.


Subject(s)
Data Curation/methods , Data Mining/methods , Databases, Factual , Software , Humans
10.
Bioinformatics ; 33(21): 3471-3472, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-28520855

ABSTRACT

SUMMARY: The neXtProt peptide uniqueness checker allows scientists to define which peptides can be used to validate the existence of human proteins, i.e. map uniquely versus multiply to human protein sequences taking into account isobaric substitutions, alternative splicing and single amino acid variants. AVAILABILITY AND IMPLEMENTATION: The pepx program is available at https://github.com/calipho-sib/pepx and can be launched from the command line or through a cgi web interface. Indexing requires a sequence file in FASTA format. The peptide uniqueness checker tool is freely available on the web at https://www.nextprot.org/tools/peptide-uniqueness-checker and from the neXtProt API at https://api.nextprot.org/. CONTACT: lydie.lane@sib.swiss.


Subject(s)
Peptides/analysis , Proteomics/methods , Software , Databases, Protein , Humans , Proteins/analysis
11.
Nucleic Acids Res ; 45(D1): D177-D182, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899619

ABSTRACT

The neXtProt human protein knowledgebase (https://www.nextprot.org) continues to add new content and tools, with a focus on proteomics and genetic variation data. neXtProt now has proteomics data for over 85% of the human proteins, as well as new tools tailored to the proteomics community. Moreover, the neXtProt release 2016-08-25 includes over 8000 phenotypic observations for over 4000 variations in a number of genes involved in hereditary cancers and channelopathies. These changes are presented in the current neXtProt update. All of the neXtProt data are available via our user interface and FTP site. We also provide API access and a SPARQL endpoint for more technical applications.


Subject(s)
Databases, Protein , Proteomics , Genetic Association Studies , Genetic Variation , Humans , Internet , Phenotype , Proteomics/methods , Software , Web Browser
12.
Article in English | MEDLINE | ID: mdl-27374119

ABSTRACT

The rapid increase in the number of published articles poses a challenge for curated databases to remain up-to-date. To help the scientific community and database curators deal with this issue, we have developed an application, neXtA5, which prioritizes the literature for specific curation requirements. Our system, neXtA5, is a curation service composed of three main elements. The first component is a named-entity recognition module, which annotates MEDLINE over some predefined axes. This report focuses on three axes: Diseases, the Molecular Function and Biological Process sub-ontologies of the Gene Ontology (GO). The automatic annotations are then stored in a local database, BioMed, for each annotation axis. Additional entities such as species and chemical compounds are also identified. The second component is an existing search engine, which retrieves the most relevant MEDLINE records for any given query. The third component uses the content of BioMed to generate an axis-specific ranking, which takes into account the density of named-entities as stored in the Biomed database. The two ranked lists are ultimately merged using a linear combination, which has been specifically tuned to support the annotation of each axis. The fine-tuning of the coefficients is formally reported for each axis-driven search. Compared with PubMed, which is the system used by most curators, the improvement is the following: +231% for Diseases, +236% for Molecular Functions and +3153% for Biological Process when measuring the precision of the top-returned PMID (P0 or mean reciprocal rank). The current search methods significantly improve the search effectiveness of curators for three important curation axes. Further experiments are being performed to extend the curation types, in particular protein-protein interactions, which require specific relationship extraction capabilities. 
In parallel, user-friendly interfaces powered by a set of JSON web services are currently being implemented into the neXtProt annotation pipeline. Available on: http://babar.unige.ch:8082/neXtA5. Database URL: http://babar.unige.ch:8082/neXtA5/fetcher.jsp.


Subject(s)
Data Curation/methods , Data Mining/methods , Electronic Data Processing/methods , MEDLINE , Search Engine/methods
13.
Nucleic Acids Res ; 43(Database issue): D764-70, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25593349

ABSTRACT

neXtProt (http://www.nextprot.org) is a human protein-centric knowledgebase developed at the SIB Swiss Institute of Bioinformatics. Focused solely on human proteins, neXtProt aims to provide a state-of-the-art resource for the representation of human biology by capturing a wide range of data, precise annotations, fully traceable data provenance and a web interface that enables researchers to find and view information in a comprehensive manner. Since the introductory neXtProt publication, significant advances have been made on three main aspects: the representation of proteomics data, an extended representation of human variants and the development of an advanced search capability built around semantic technologies. These changes are presented in the current neXtProt update.


Subject(s)
Databases, Protein , Genetic Variation , Proteins/genetics , Proteomics , Cell Line , Disease/genetics , Humans , Internet , Proteome
14.
Eur J Anaesthesiol ; 29(9): 446-51, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22828385

ABSTRACT

CONTEXT: Standardised drug syringe labelling may reduce drug errors, but data on drug syringe labelling use in European anaesthesiology departments are lacking. OBJECTIVES: Survey investigating if standardised drug syringe labelling is used, and if there are geographical, demographic and professional differences in hospitals with and without use of drug syringe labelling. DESIGN: Structured, web-based anonymised questionnaire. SETTING: European anaesthesia departments. PARTICIPANTS: Members of the European Society of Anaesthesiology. INTERVENTION: Online survey from 2 February to 12 April 2011. MAIN OUTCOME MEASURE: Standardised drug syringe labelling use and, if yes, drug syringe labelling for insulin and norepinephrine. METHODS: Descriptive and comparative analyses of users and nonusers of standardised drug syringe labelling. RESULTS: One thousand and sixty-four of 4163 members (25.6%) from 72 countries participated, among whom 660 (62.0%) used standardised drug syringe labelling; in Northern and Western Europe, there were 428 users of drug syringe labelling and 112 nonusers, and in Southern and Eastern Europe, there were 184 users and 255 nonusers (P < 0.001). Three hundred and ninety-four (37%) respondents used standardised drug syringe labelling hospital-wide; 202 (30.1%) used International Organisation of Standardisation-based standardised drug syringe labelling, 101 (15.1%) used similar systems, 278 (41.5%) used other systems and 89 (13.3%) used labels supplied by drug manufacturers. The label colour for insulin was reported as white or 'none' in 519 (76.7%) answers and another colour in 158 (23.3%). The label colour for norepinephrine was reported as violet in 206 (30.4%) answers, white or 'none' in 226 (33.3%), red in 114 (16.8%) and another colour in 132 (19.5%). A standardised drug syringe labelling system supplied by the pharmaceutical industry was supported by 819 (76.9%) respondents, and not supported by 227 (21.3%). 
CONCLUSION: A majority of European anaesthesiology departments used standardised drug syringe labelling, with regional differences and mostly without following an international standard. Thus, there are options for quality improvement in drug syringe labelling.


Subject(s)
Anesthesiology , Drug Labeling/standards , Syringes , Drug Industry , Europe , Humans , Surveys and Questionnaires