A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses.
Waagmeester, Andra; Willighagen, Egon L; Su, Andrew I; Kutmon, Martina; Gayo, Jose Emilio Labra; Fernández-Álvarez, Daniel; Groom, Quentin; Schaap, Peter J; Verhagen, Lisa M; Koehorst, Jasper J.
  • Waagmeester A; Micelio, Antwerpen, Belgium.
  • Willighagen EL; Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands.
  • Su AI; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA.
  • Kutmon M; Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands.
  • Gayo JEL; Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands.
  • Fernández-Álvarez D; WESO Research Group, University of Oviedo, Oviedo, Spain.
  • Groom Q; WESO Research Group, University of Oviedo, Oviedo, Spain.
  • Schaap PJ; Meise Botanic Garden, Meise, Belgium.
  • Verhagen LM; Department of Agrotechnology and Food Sciences, Laboratory of Systems and Synthetic Biology, Wageningen University & Research, Wageningen, The Netherlands.
  • Koehorst JJ; Intravacc, PO Box 450, 3720 AL, Bilthoven, The Netherlands.
BMC Biol ; 19(1): 12, 2021 01 22.
Article in English | MEDLINE | ID: covidwho-1044598
ABSTRACT

BACKGROUND:

Pandemics, even more than other medical problems, require swift integration of knowledge. When a pandemic is caused by a new virus, understanding the underlying biology may help in finding solutions. In a setting with a large number of loosely related projects and initiatives, we need common ground, also known as a "commons." Wikidata, a public knowledge graph aligned with Wikipedia, is such a commons and uses unique identifiers to link to knowledge in other knowledge bases. However, Wikidata may not always have the right schema for the urgent questions. In this paper, we address this problem by showing how a data schema required for the integration can be modeled with entity schemas represented by Shape Expressions.

RESULTS:

As a telling example, we describe the process of aligning resources on the genomes and proteomes of the SARS-CoV-2 virus and related viruses, as well as how Shape Expressions can be defined for Wikidata to model the knowledge, helping others who study the SARS-CoV-2 pandemic. We demonstrate how this model can be used to make data from various resources interoperable by integrating data from NCBI (National Center for Biotechnology Information) Taxonomy, NCBI Genes, UniProt, and WikiPathways. Based on that model, a set of automated applications, or bots, was written to update Wikidata regularly from these sources, and the bots were added to a platform that runs these updates automatically.
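
To make the idea of checking items against a schema concrete, the following is a minimal sketch in Python. It is not the Shape Expressions language itself and not the authors' bot code: the dictionary layout and the names `TAXON_SHAPE` and `conforms` are hypothetical, and an item is simplified to a mapping from Wikidata property IDs to lists of string values. The property IDs used (P31 "instance of", P685 "NCBI taxonomy ID", P171 "parent taxon") and the item Q16521 ("taxon") are existing Wikidata identifiers.

```python
# Illustrative only: a tiny shape-style check, not ShEx and not the paper's code.
# Schema layout and function names are assumptions made for this sketch.

# Each constraint: property -> (min count, max count or None, value predicate).
TAXON_SHAPE = {
    "P31": (1, None, lambda v: v == "Q16521"),       # instance of: taxon
    "P685": (1, 1, lambda v: v.isdigit()),           # NCBI taxonomy ID
    "P171": (0, None, lambda v: v.startswith("Q")),  # parent taxon (optional)
}


def conforms(item, shape):
    """Return a list of violations; an empty list means the item conforms."""
    problems = []
    for prop, (lo, hi, check) in shape.items():
        values = item.get(prop, [])
        if len(values) < lo:
            problems.append(f"{prop}: expected at least {lo} value(s)")
        if hi is not None and len(values) > hi:
            problems.append(f"{prop}: expected at most {hi} value(s)")
        # Report every value that fails the per-value predicate.
        problems.extend(f"{prop}: bad value {v!r}" for v in values if not check(v))
    return problems


# Simplified records; 2697049 is the NCBI Taxonomy ID of SARS-CoV-2.
sars_cov_2 = {"P31": ["Q16521"], "P685": ["2697049"]}
incomplete = {"P31": ["Q16521"]}  # the taxonomy identifier is missing

print(conforms(sars_cov_2, TAXON_SHAPE))
print(conforms(incomplete, TAXON_SHAPE))
```

A check of this kind could run before a bot writes to the knowledge graph, so that records missing a required identifier are flagged instead of being uploaded.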

CONCLUSIONS:

Although this workflow was developed and applied in the context of the COVID-19 pandemic, it was also applied to other human coronaviruses (MERS, SARS, human coronavirus NL63, human coronavirus 229E, human coronavirus HKU1, human coronavirus OC43) to demonstrate its broader applicability.

Full text: Available Collection: International databases Database: MEDLINE Main subject: Genomics / Proteomics / Knowledge Bases / SARS-CoV-2 / COVID-19 Limits: Humans Language: English Journal: BMC Biol Journal subject: Biology Year: 2021 Document type: Article Country of affiliation: S12915-020-00940-y
