Search | VHL Regional Portal

1.

Can Small Molecules Provide Clues on Disease Progression in Cerebrospinal Fluid from Mild Cognitive Impairment and Alzheimer's Disease Patients?

Talavera Andújar, Begoña; Mary, Arnaud; Venegas, Carmen; Cheng, Tiejun; Zaslavsky, Leonid; Bolton, Evan E; Heneka, Michael T; Schymanski, Emma L.

Environ Sci Technol ; 58(9): 4181-4192, 2024 Mar 05.

Article in English | MEDLINE | ID: mdl-38373301

ABSTRACT

Alzheimer's disease (AD) is a complex and multifactorial neurodegenerative disease, which is currently diagnosed via clinical symptoms and nonspecific biomarkers (such as Aβ1-42, t-Tau, and p-Tau) measured in cerebrospinal fluid (CSF), which alone do not provide sufficient insights into disease progression. In this pilot study, these biomarkers were complemented with small-molecule analysis using non-target high-resolution mass spectrometry coupled with liquid chromatography (LC) on the CSF of three groups: AD, mild cognitive impairment (MCI) due to AD, and a non-demented (ND) control group. An open-source cheminformatics pipeline based on MS-DIAL and patRoon was enhanced using CSF- and AD-specific suspect lists to assist in data interpretation. Chemical Similarity Enrichment Analysis revealed a significant increase of hydroxybutyrates in AD, including 3-hydroxybutanoic acid, which was found at higher levels in AD compared to MCI and ND. Furthermore, a highly sensitive target LC-MS method was used to quantify 35 bile acids (BAs) in the CSF, revealing several statistically significant differences including higher dehydrolithocholic acid levels and decreased conjugated BA levels in AD. This work provides several promising small-molecule hypotheses that could be used to help track the progression of AD in CSF samples.

Subject(s)

Alzheimer Disease , Cognitive Dysfunction , Neurodegenerative Diseases , Humans , Alzheimer Disease/cerebrospinal fluid , Alzheimer Disease/diagnosis , Alzheimer Disease/psychology , tau Proteins/cerebrospinal fluid , Amyloid beta-Peptides/cerebrospinal fluid , Pilot Projects , Cognitive Dysfunction/cerebrospinal fluid , Cognitive Dysfunction/diagnosis , Cognitive Dysfunction/psychology , Biomarkers , Disease Progression

2.

Bridging glycoinformatics and cheminformatics: integration efforts between GlyCosmos and PubChem.

Cheng, Tiejun; Ono, Tamiko; Shiota, Masaaki; Yamada, Issaku; Aoki-Kinoshita, Kiyoko F; Bolton, Evan E.

Glycobiology ; 33(6): 454-463, 2023 06 21.

Article in English | MEDLINE | ID: mdl-37129482

ABSTRACT

The GlyCosmos Glycoscience Portal (https://glycosmos.org) and PubChem (https://pubchem.ncbi.nlm.nih.gov/) are major portals for glycoscience and chemistry, respectively. GlyCosmos is a portal for glycan-related repositories, including GlyTouCan, GlycoPOST, and UniCarb-DR, as well as for glycan-related data resources that have been integrated from a variety of 'omics databases. Glycogenes, glycoproteins, lectins, pathways, and disease information related to glycans are accessible from GlyCosmos. PubChem, on the other hand, is a chemistry-based portal at the National Center for Biotechnology Information. PubChem provides information not only on chemicals, but also genes, proteins, pathways, as well as patents, bioassays, and more, from hundreds of data resources from around the world. In this work, these 2 portals have made substantial efforts to integrate their complementary data to allow users to cross between these 2 domains. In addition to glycan structures, key information, such as glycan-related genes, relevant diseases, glycoproteins, and pathways, was integrated and cross-linked with one another. The interfaces were designed to enable users to easily find, access, download, and reuse data of interest across these resources. Use cases are described illustrating and highlighting the type of content that can be investigated. In total, these integrations provide life science researchers improved awareness and enhanced access to glycan-related information.

Subject(s)

Databases, Chemical , Polysaccharides , Glycosylation , Workflow , Informatics , Polysaccharides/chemistry , Glycoconjugates/chemistry

3.

PubChem 2023 update.

Kim, Sunghwan; Chen, Jie; Cheng, Tiejun; Gindulyte, Asta; He, Jia; He, Siqian; Li, Qingliang; Shoemaker, Benjamin A; Thiessen, Paul A; Yu, Bo; Zaslavsky, Leonid; Zhang, Jian; Bolton, Evan E.

Nucleic Acids Res ; 51(D1): D1373-D1380, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36305812

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a popular chemical information resource that serves a wide range of use cases. In the past two years, a number of changes were made to PubChem. Data from more than 120 data sources was added to PubChem. Some major highlights include: the integration of Google Patents data into PubChem, which greatly expanded the coverage of the PubChem Patent data collection; the creation of the Cell Line and Taxonomy data collections, which provide quick and easy access to chemical information for a given cell line and taxon, respectively; and the update of the bioassay data model. In addition, new functionalities were added to the PubChem programmatic access protocols, PUG-REST and PUG-View, including support for target-centric data download for a given protein, gene, pathway, cell line, and taxon and the addition of the 'standardize' option to PUG-REST, which returns the standardized form of an input chemical structure. A significant update was also made to PubChemRDF. The present paper provides an overview of these changes.

Subject(s)

Databases, Chemical , Drug Discovery , Drug Discovery/methods , Biological Assay , Proteins , Cheminformatics

4.

The NORMAN Suspect List Exchange (NORMAN-SLE): facilitating European and worldwide collaboration on suspect screening in high resolution mass spectrometry.

Mohammed Taha, Hiba; Aalizadeh, Reza; Alygizakis, Nikiforos; Antignac, Jean-Philippe; Arp, Hans Peter H; Bade, Richard; Baker, Nancy; Belova, Lidia; Bijlsma, Lubertus; Bolton, Evan E; Brack, Werner; Celma, Alberto; Chen, Wen-Ling; Cheng, Tiejun; Chirsir, Parviel; Cirka, Lubos; D'Agostino, Lisa A; Djoumbou Feunang, Yannick; Dulio, Valeria; Fischer, Stellan; Gago-Ferrero, Pablo; Galani, Aikaterini; Geueke, Birgit; Glowacka, Natalia; Glüge, Juliane; Groh, Ksenia; Grosse, Sylvia; Haglund, Peter; Hakkinen, Pertti J; Hale, Sarah E; Hernandez, Felix; Janssen, Elisabeth M-L; Jonkers, Tim; Kiefer, Karin; Kirchner, Michal; Koschorreck, Jan; Krauss, Martin; Krier, Jessy; Lamoree, Marja H; Letzel, Marion; Letzel, Thomas; Li, Qingliang; Little, James; Liu, Yanna; Lunderberg, David M; Martin, Jonathan W; McEachran, Andrew D; McLean, John A; Meier, Christiane; Meijer, Jeroen.

Environ Sci Eur ; 34(1): 104, 2022.

Article in English | MEDLINE | ID: mdl-36284750

ABSTRACT

Background: The NORMAN Association (https://www.norman-network.com/) initiated the NORMAN Suspect List Exchange (NORMAN-SLE; https://www.norman-network.com/nds/SLE/) in 2015, following the NORMAN collaborative trial on non-target screening of environmental water samples by mass spectrometry. Since then, this exchange of information on chemicals that are expected to occur in the environment, along with the accompanying expert knowledge and references, has become a valuable knowledge base for "suspect screening" lists. The NORMAN-SLE now serves as a FAIR (Findable, Accessible, Interoperable, Reusable) chemical information resource worldwide. Results: The NORMAN-SLE contains 99 separate suspect list collections (as of May 2022) from over 70 contributors around the world, totalling over 100,000 unique substances. The substance classes include per- and polyfluoroalkyl substances (PFAS), pharmaceuticals, pesticides, natural toxins, high production volume substances covered under the European REACH regulation (EC: 1272/2008), priority contaminants of emerging concern (CECs) and regulatory lists from NORMAN partners. Several lists focus on transformation products (TPs) and complex features detected in the environment with various levels of provenance and structural information. Each list is available for separate download. The merged, curated collection is also available as the NORMAN Substance Database (NORMAN SusDat). Both the NORMAN-SLE and NORMAN SusDat are integrated within the NORMAN Database System (NDS). The individual NORMAN-SLE lists receive digital object identifiers (DOIs) and traceable versioning via a Zenodo community (https://zenodo.org/communities/norman-sle), with a total of > 40,000 unique views, > 50,000 unique downloads and 40 citations (May 2022). 
NORMAN-SLE content is progressively integrated into large open chemical databases such as PubChem (https://pubchem.ncbi.nlm.nih.gov/) and the US EPA's CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/), enabling further access to these lists, along with the additional functionality and calculated properties these resources offer. PubChem has also integrated significant annotation content from the NORMAN-SLE, including a classification browser (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=101). Conclusions: The NORMAN-SLE offers a specialized service for hosting suspect screening lists of relevance for the environmental community in an open, FAIR manner that allows integration with other major chemical resources. These efforts foster the exchange of information between scientists and regulators, supporting the paradigm shift to the "one substance, one assessment" approach. New submissions are welcome via the contacts provided on the NORMAN-SLE website (https://www.norman-network.com/nds/SLE/). Supplementary Information: The online version contains supplementary material available at 10.1186/s12302-022-00680-6.

5.

Studying the Parkinson's disease metabolome and exposome in biological samples through different analytical and cheminformatics approaches: a pilot study.

Talavera Andújar, Begoña; Aurich, Dagny; Aho, Velma T E; Singh, Randolph R; Cheng, Tiejun; Zaslavsky, Leonid; Bolton, Evan E; Mollenhauer, Brit; Wilmes, Paul; Schymanski, Emma L.

Anal Bioanal Chem ; 414(25): 7399-7419, 2022 Oct.

Article in English | MEDLINE | ID: mdl-35829770

ABSTRACT

Parkinson's disease (PD) is the second most prevalent neurodegenerative disease, with an increasing incidence in recent years due to the aging population. Genetic mutations alone only explain <10% of PD cases, while environmental factors, including small molecules, may play a significant role in PD. In the present work, 22 plasma (11 PD, 11 control) and 19 feces samples (10 PD, 9 control) were analyzed by non-target high-resolution mass spectrometry (NT-HRMS) coupled to two liquid chromatography (LC) methods (reversed-phase (RP) and hydrophilic interaction liquid chromatography (HILIC)). A cheminformatics workflow was optimized using open software (MS-DIAL and patRoon) and open databases (all public MSP-formatted spectral libraries for MS-DIAL, PubChemLite for Exposomics, and the LITMINEDNEURO list for patRoon). Furthermore, five disease-specific databases and three suspect lists (on PD and related disorders) were developed, using PubChem functionality to identify relevant unknown chemicals. The results showed that non-target screening with the larger databases generally provided better results compared with smaller suspect lists. However, two suspect screening approaches with patRoon were also good options to study specific chemicals in PD. The combination of chromatographic methods (RP and HILIC) as well as two ionization modes (positive and negative) enhanced the coverage of chemicals in the biological samples. While most metabolomics studies in PD have focused on blood and cerebrospinal fluid, we found a higher number of relevant features in feces, such as alanine betaine or nicotinamide, which can be directly metabolized by gut microbiota. This highlights the potential role of gut dysbiosis in PD development.

Subject(s)

Exposome , Neurodegenerative Diseases , Parkinson Disease , Aged , Alanine , Betaine , Cheminformatics , Humans , Metabolome , Metabolomics/methods , Niacinamide , Pilot Projects

6.

PubChem Protein, Gene, Pathway, and Taxonomy Data Collections: Bridging Biology and Chemistry through Target-Centric Views of PubChem Data.

Kim, Sunghwan; Cheng, Tiejun; He, Siqian; Thiessen, Paul A; Li, Qingliang; Gindulyte, Asta; Bolton, Evan E.

J Mol Biol ; 434(11): 167514, 2022 06 15.

Article in English | MEDLINE | ID: mdl-35227770

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public chemical database at the U.S. National Institutes of Health. Visited by millions of users every month, it plays a role as a key chemical information resource for biomedical research communities. Data in PubChem is from hundreds of contributors and organized into multiple collections by record type. Among these are the Protein, Gene, Pathway, and Taxonomy data collections. Records in these collections contain information on chemicals related to a given biological target (i.e., protein, gene, pathway, or taxon), helping users to analyze and interpret the biological activity data of molecules. In addition, annotations about the biological targets are collected from authoritative or curated data sources and integrated into the four collections. The content can be programmatically accessed through PubChem's web service interfaces (including PUG View). A machine-readable representation of this content is also provided within PubChemRDF.

Subject(s)

Databases, Chemical , Biology , Drug Discovery , Proteins/genetics

7.

iCn3D: From Web-Based 3D Viewer to Structural Analysis Tool in Batch Mode.

Wang, Jiyao; Youkharibache, Philippe; Marchler-Bauer, Aron; Lanczycki, Christopher; Zhang, Dachuan; Lu, Shennan; Madej, Thomas; Marchler, Gabriele H; Cheng, Tiejun; Chong, Li Chuin; Zhao, Sarah; Yang, Kevin; Lin, Jack; Cheng, Zhiyu; Dunn, Rachel; Malkaram, Sridhar Acharya; Tai, Chin-Hsien; Enoma, David; Busby, Ben; Johnson, Nicholas L; Tabaro, Francesco; Song, Guangfeng; Ge, Yuchen.

Front Mol Biosci ; 9: 831740, 2022.

Article in English | MEDLINE | ID: mdl-35252351

ABSTRACT

iCn3D was initially developed as a web-based 3D molecular viewer. It then evolved from visualization into a full-featured interactive structural analysis software. It became a collaborative research instrument through the sharing of permanent, shortened URLs that encapsulate not only annotated visual molecular scenes, but also all underlying data and analysis scripts in a FAIR manner. More recently, with the growth of structural databases, the need to analyze large structural datasets systematically led us to use Python scripts and convert the code to be used in Node.js scripts. We showed a few examples of Python scripts at https://github.com/ncbi/icn3d/tree/master/icn3dpython to export secondary structures or PNG images from iCn3D. Users just need to replace the URL in the Python scripts to export other annotations from iCn3D. Furthermore, any interactive iCn3D feature can be converted into a Node.js script to be run in batch mode, enabling an interactive analysis performed on one or a handful of protein complexes to be scaled up to analysis features of large ensembles of structures. Example Node.js analysis scripts are currently available at https://github.com/ncbi/icn3d/tree/master/icn3dnode. This development will enable ensemble analyses on growing structural databases such as AlphaFold or RoseTTAFold on one hand and Electron Microscopy on the other. In this paper, we also review new features such as DelPhi electrostatic potential, 3D view of mutations, alignment of multiple chains, assembly of multiple structures by realignment, dynamic symmetry calculation, 2D cartoons at different levels, interactive contact maps, and use of iCn3D in Jupyter Notebook as described at https://pypi.org/project/icn3dpy.

8.

Plant Reactome and PubChem: The Plant Pathway and (Bio)Chemical Entity Knowledgebases.

Gupta, Parul; Naithani, Sushma; Preece, Justin; Kim, Sunghwan; Cheng, Tiejun; D'Eustachio, Peter; Elser, Justin; Bolton, Evan E; Jaiswal, Pankaj.

Methods Mol Biol ; 2443: 511-525, 2022.

Article in English | MEDLINE | ID: mdl-35037224

ABSTRACT

Plant Reactome (https://plantreactome.gramene.org) and PubChem (https://pubchem.ncbi.nlm.nih.gov) are two reference data portals and resources for curated plant pathways, small molecules, metabolites, gene products, and macromolecular interactions. Plant Reactome knowledgebase, a conceptual plant pathway network, is built by biocuration and integrating (bio)chemical entities, gene products, and macromolecular interactions. It provides manually curated pathways for the reference species Oryza sativa (rice) and gene orthology-based projections that extend pathway knowledge to 106 plant species. Currently, it hosts 320 reference pathways for plant metabolism, hormone signaling, transport, genetic regulation, plant organ development and differentiation, and biotic and abiotic stress responses. In addition to the pathway browsing and search functions, the Plant Reactome provides the analysis tools for pathway comparison between reference and projected species, pathway enrichment in gene expression data, and overlay of gene-gene interaction data on pathways. PubChem, a popular reference database of (bio)chemical entities, provides information on small molecules and other types of chemical entities, such as siRNAs, miRNAs, lipids, carbohydrates, and chemically modified nucleotides. The data in PubChem is collected from hundreds of data sources, including Plant Reactome. This chapter provides a brief overview of the Plant Reactome and the PubChem knowledgebases, their association to other public resources providing accessory information, and how users can readily access the contents.

Subject(s)

Knowledge Bases , Metabolic Networks and Pathways , Databases, Factual , Plants/genetics , Plants/metabolism , Proteins/metabolism

9.

Discovering and Summarizing Relationships Between Chemicals, Genes, Proteins, and Diseases in PubChem.

Zaslavsky, Leonid; Cheng, Tiejun; Gindulyte, Asta; He, Siqian; Kim, Sunghwan; Li, Qingliang; Thiessen, Paul; Yu, Bo; Bolton, Evan E.

Front Res Metr Anal ; 6: 689059, 2021.

Article in English | MEDLINE | ID: mdl-34322655

ABSTRACT

The literature knowledge panels developed and implemented in PubChem are described. These help to uncover and summarize important relationships between chemicals, genes, proteins, and diseases by analyzing co-occurrences of terms in biomedical literature abstracts. Named entities in PubMed records are matched with chemical names in PubChem, disease names in Medical Subject Headings (MeSH), and gene/protein names in popular gene/protein information resources, and the most closely related entities are identified using statistical analysis and relevance-based sampling. Knowledge panels for the co-occurrence of chemical, disease, and gene/protein entities are included in PubChem Compound, Protein, and Gene pages, summarizing these in a compact form. Statistical methods for removing redundancy and estimating relevance scores are discussed, along with benefits and pitfalls of relying on automated (i.e., not human-curated) methods operating on data from multiple heterogeneous sources.
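As a rough illustration of the co-occurrence idea described in this abstract, the sketch below counts how often pairs of named entities appear together in the same record. It is only a minimal example under stated assumptions: the function name `cooccurrence_counts` and the toy records are hypothetical, and the statistical relevance scoring, redundancy removal, and sampling mentioned in the abstract are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Count how many records mention each unordered pair of entities.

    `records` is an iterable of entity collections, one per abstract.
    (Hypothetical input shape; real entity matching is out of scope.)
    """
    counts = Counter()
    for entities in records:
        # De-duplicate within a record and sort so each pair has one key.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


# Toy data standing in for entities matched in three abstracts.
toy_records = [
    {"aspirin", "pain"},
    {"aspirin", "pain", "PTGS2"},
    {"aspirin", "PTGS2"},
]
pair_counts = cooccurrence_counts(toy_records)
```

Raw counts like these would normally be only the starting point; a relevance score would still have to correct for how common each entity is on its own.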

10.

Enhancing the interoperability of glycan data flow between ChEBI, PubChem and GlyGen.

Navelkar, Rahi; Owen, Gareth; Mutherkrishnan, Venkatesh; Thiessen, Paul; Cheng, Tiejun; Bolton, Evan; Edwards, Nathan; Tiemeyer, Michael; Campbell, Matthew P; Martin, Maria; Vora, Jeet; Kahsay, Robel; Mazumder, Raja.

Glycobiology ; 31(11): 1510-1519, 2021 12 18.

Article in English | MEDLINE | ID: mdl-34314492

ABSTRACT

Glycans play a vital role in health, disease, bioenergy, biomaterials and bio-therapeutics. As a result, there is keen interest to identify and increase glycan data in bioinformatics databases like ChEBI and PubChem, and connecting them to resources at the EMBL-EBI and NCBI to facilitate access to important annotations at a global level. GlyTouCan is a comprehensive archival database that contains glycans obtained primarily through batch upload from glycan repositories, glycoprotein databases and individual laboratories. In many instances, the glycan structures deposited in GlyTouCan may not be fully defined or have supporting experimental evidence and citations. Databases like ChEBI and PubChem were designed to accommodate complete atomistic structures with well-defined chemical linkages. As a result, they cannot easily accommodate the structural ambiguity inherent in glycan databases. Consequently, there is a need to improve the organization of glycan data coherently to enhance connectivity across the major NCBI, EMBL-EBI and glycoscience databases. This paper outlines a workflow developed in collaboration between GlyGen, ChEBI and PubChem to improve the visibility and connectivity of glycan data across these resources. GlyGen hosts a subset of glycans (~29,000) from the GlyTouCan database and has submitted valuable glycan annotations to the PubChem database and integrated over 10,500 (including ambiguously defined) glycans into the ChEBI database. The integrated glycans were prioritized based on links to PubChem and connectivity to glycoprotein data. The pipeline provides a blueprint for how glycan data can be harmonized between different resources. The current PubChem, ChEBI and GlyTouCan mappings can be downloaded from GlyGen (https://data.glygen.org).

Subject(s)

Databases, Chemical , Glycoproteins/chemistry , Polysaccharides/chemistry , Software , Carbohydrate Conformation , Glycomics

11.

PubChem in 2021: new data content and improved web interfaces.

Kim, Sunghwan; Chen, Jie; Cheng, Tiejun; Gindulyte, Asta; He, Jia; He, Siqian; Li, Qingliang; Shoemaker, Benjamin A; Thiessen, Paul A; Yu, Bo; Zaslavsky, Leonid; Zhang, Jian; Bolton, Evan E.

Nucleic Acids Res ; 49(D1): D1388-D1395, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33151290

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a popular chemical information resource that serves the scientific community as well as the general public, with millions of unique users per month. In the past two years, PubChem made substantial improvements. Data from more than 100 new data sources were added to PubChem, including chemical-literature links from Thieme Chemistry, chemical and physical property links from SpringerMaterials, and patent links from the World Intellectual Property Organization (WIPO). PubChem's homepage and individual record pages were updated to help users find desired information faster. This update involved a data model change for the data objects used by these pages as well as by programmatic users. Several new services were introduced, including the PubChem Periodic Table and Element pages, Pathway pages, and Knowledge panels. Additionally, in response to the coronavirus disease 2019 (COVID-19) outbreak, PubChem created a special data collection that contains PubChem data related to COVID-19 and the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

Subject(s)

COVID-19/prevention & control , Databases, Chemical , Information Storage and Retrieval/statistics & numerical data , SARS-CoV-2/isolation & purification , User-Computer Interface , COVID-19/epidemiology , COVID-19/virology , Drug Discovery/statistics & numerical data , Epidemics , Humans , Information Storage and Retrieval/methods , Internet , Public Health/statistics & numerical data , SARS-CoV-2/physiology , Software

12.

Plant Reactome: a knowledgebase and resource for comparative pathway analysis.

Naithani, Sushma; Gupta, Parul; Preece, Justin; D'Eustachio, Peter; Elser, Justin L; Garg, Priyanka; Dikeman, Daemon A; Kiff, Jason; Cook, Justin; Olson, Andrew; Wei, Sharon; Tello-Ruiz, Marcela K; Mundo, Antonio Fabregat; Munoz-Pomer, Alfonso; Mohammed, Suhaib; Cheng, Tiejun; Bolton, Evan; Papatheodorou, Irene; Stein, Lincoln; Ware, Doreen; Jaiswal, Pankaj.

Nucleic Acids Res ; 48(D1): D1093-D1103, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31680153

ABSTRACT

Plant Reactome (https://plantreactome.gramene.org) is an open-source, comparative plant pathway knowledgebase of the Gramene project. It uses Oryza sativa (rice) as a reference species for manual curation of pathways and extends pathway knowledge to another 82 plant species via gene-orthology projection using the Reactome data model and framework. It currently hosts 298 reference pathways, including metabolic and transport pathways, transcriptional networks, hormone signaling pathways, and plant developmental processes. In addition to browsing plant pathways, users can upload and analyze their omics data, such as the gene-expression data, and overlay curated or experimental gene-gene interaction data to extend pathway knowledge. The curation team actively engages researchers and students on gene and pathway curation by offering workshops and online tutorials. The Plant Reactome supports, implements and collaborates with the wider community to make data and tools related to genes, genomes, and pathways Findable, Accessible, Interoperable and Re-usable (FAIR).

Subject(s)

Computational Biology/methods , Databases, Genetic , Genomics , Metabolomics , Plants/genetics , Plants/metabolism , Proteomics , Gene Regulatory Networks , Genomics/methods , Humans , Metabolic Networks and Pathways , Metabolomics/methods , Proteomics/methods , Signal Transduction , Web Browser

13.

PUG-View: programmatic access to chemical annotations integrated in PubChem.

Kim, Sunghwan; Thiessen, Paul A; Cheng, Tiejun; Zhang, Jian; Gindulyte, Asta; Bolton, Evan E.

J Cheminform ; 11(1): 56, 2019 Aug 09.

Article in English | MEDLINE | ID: mdl-31399858

ABSTRACT

PubChem is a chemical data repository that provides comprehensive information on various chemical entities. It contains a wealth of chemical information from hundreds of data sources. Programmatic access to this large amount of data provides researchers with new opportunities for data-intensive research. PubChem provides several programmatic access routes. One of these is PUG-View, which is a Representational State Transfer (REST)-style web service interface specialized for accessing annotation data contained in PubChem. The present paper describes various aspects of PUG-View, including the scope of data accessible through PUG-View, the syntax for formulating a PUG-View request URL, the difference of PUG-View from other web service interfaces in PubChem, and its limitations and usage policies.
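To make the idea of formulating a PUG-View request URL concrete, here is a small sketch that assembles one from its parts. It assumes the commonly documented pattern of a base path followed by `data`, a record type, an identifier, and an output format, with an optional `heading` query parameter; the helper name `pug_view_url` is hypothetical, and the abstract itself does not spell out the syntax, so check the PubChem documentation before relying on it. No network request is made.

```python
from urllib.parse import quote, urlencode

# Assumed base path for PUG-View requests.
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view"


def pug_view_url(record_type, record_id, output="JSON", heading=None):
    """Build a PUG-View URL for the annotations of a single record.

    record_type: e.g. "compound" (assumed record-type keyword)
    record_id:   the record identifier, e.g. a PubChem CID
    heading:     optional section heading to restrict the response
    """
    url = f"{PUG_VIEW_BASE}/data/{quote(record_type)}/{quote(str(record_id))}/{output}"
    if heading:
        # urlencode handles spaces and other reserved characters.
        url += "?" + urlencode({"heading": heading})
    return url


full_record = pug_view_url("compound", 2244)
one_section = pug_view_url("compound", 2244, heading="Boiling Point")
```

Keeping URL construction in one function makes it easier to adjust if the service's syntax differs from what is assumed here.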

14.

PubChem 2019 update: improved access to chemical data.

Kim, Sunghwan; Chen, Jie; Cheng, Tiejun; Gindulyte, Asta; He, Jia; He, Siqian; Li, Qingliang; Shoemaker, Benjamin A; Thiessen, Paul A; Yu, Bo; Zaslavsky, Leonid; Zhang, Jian; Bolton, Evan E.

Nucleic Acids Res ; 47(D1): D1102-D1109, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30371825

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a key chemical information resource for the biomedical research community. Substantial improvements were made in the past few years. New data content was added, including spectral information, scientific articles mentioning chemicals, and information for food and agricultural chemicals. PubChem released new web interfaces, such as PubChem Target View page, Sources page, Bioactivity dyad pages and Patent View page. PubChem also released a major update to PubChem Widgets and introduced a new programmatic access interface, called PUG-View. This paper describes these new developments in PubChem.

Subject(s)

Computational Biology/methods , Databases, Chemical , Pharmaceutical Preparations/chemistry , Small Molecule Libraries/chemistry , Animals , Biological Assay/methods , Drug Discovery/methods , High-Throughput Screening Assays/methods , Humans , Information Storage and Retrieval/methods , Internet , Molecular Structure , Patents as Topic , Structure-Activity Relationship

15.

An update on PUG-REST: RESTful interface for programmatic access to PubChem.

Kim, Sunghwan; Thiessen, Paul A; Cheng, Tiejun; Yu, Bo; Bolton, Evan E.

Nucleic Acids Res ; 46(W1): W563-W570, 2018 07 02.

Article in English | MEDLINE | ID: mdl-29718389

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is one of the largest open chemical information resources available. It currently receives millions of unique users per month on average, serving as a key resource for many research fields such as cheminformatics, chemical biology, medicinal chemistry, and drug discovery. PubChem provides multiple programmatic access routes to its data and services. One of them is PUG-REST, a Representational State Transfer (REST)-like web service interface to PubChem. On average, PUG-REST receives more than a million requests per day from tens of thousands of unique users. The present paper provides an update on PUG-REST since our previous paper published in 2015. This includes access to new kinds of data (e.g. concise bioactivity data, table of contents headings, etc.), full implementation of synchronous fast structure search, support for assay data retrieval using accession identifiers in response to the deprecation of NCBI's GI numbers, data exchange between PUG-REST and NCBI's E-Utilities through the List Gateway, implementation of dynamic traffic control through throttling, and enhanced usage policies. In addition, example Perl scripts are provided, which the user can easily modify, run, or translate into another scripting language.

Subject(s)

Chemistry, Pharmaceutical/methods , Drug Discovery/methods , Programming Languages , User-Computer Interface , Databases, Chemical , Humans , Information Storage and Retrieval/methods , Internet , Small Molecule Libraries/pharmacology
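The PUG-REST abstract above mentions programmatic access and traffic control through throttling. The sketch below shows one way a client might build a property-lookup URL and pace its own requests. It is a hedged illustration, not the service's implementation: the helper names are hypothetical, the URL pattern follows the commonly documented `compound/name/.../property/.../JSON` form, and the 0.2-second spacing is an assumption based on a limit of roughly five requests per second. No network call is made here.

```python
import time
from urllib.parse import quote

# Assumed base path for PUG-REST requests.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_property_url(name, properties, output="JSON"):
    """Build a URL that looks up computed properties by compound name."""
    props = ",".join(properties)
    return f"{PUG_REST_BASE}/compound/name/{quote(name)}/property/{props}/{output}"


class MinIntervalThrottle:
    """Client-side pacing: keep successive calls at least `interval` seconds apart."""

    def __init__(self, interval=0.2, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._last = None

    def wait(self):
        """Block until the minimum interval since the previous call has passed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


url = pug_rest_property_url("aspirin", ["MolecularFormula", "MolecularWeight"])
```

In use, a client would call `wait()` on a shared throttle immediately before each request; the server may still apply its own dynamic throttling on top of this.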

16.

Large-Scale Prediction of Drug-Target Interaction: a Data-Centric Review.

Cheng, Tiejun; Hao, Ming; Takeda, Takako; Bryant, Stephen H; Wang, Yanli.

AAPS J ; 19(5): 1264-1275, 2017 09.

Article in English | MEDLINE | ID: mdl-28577120

ABSTRACT

The prediction of drug-target interactions (DTIs) is of extraordinary significance to modern drug discovery in terms of suggesting new drug candidates and repositioning old drugs. Despite technological advances, large-scale experimental determination of DTIs is still expensive and laborious. Effective and low-cost computational alternatives remain in strong need. Meanwhile, open-access resources have been rapidly growing with massive amount of bioactivity data becoming available, creating unprecedented opportunities for the development of novel in silico models for large-scale DTI prediction. In this work, we review the state-of-the-art computational approaches for identifying DTIs from a data-centric perspective: what the underlying data are and how they are utilized in each study. We also summarize popular public data resources and online tools for DTI prediction. It is found that various types of data were employed including properties of chemical structures, drug therapeutic effects and side effects, drug-target binding, drug-drug interactions, bioactivity data of drug molecules across multiple biological targets, and drug-induced gene expressions. More often, the heterogeneous data were integrated to offer better performance. However, challenges remain such as handling data imbalance, incorporating negative samples and quantitative bioactivity data, as well as maintaining cross-links among different data sources, which are essential for large-scale and automated information integration.

Subject(s)

Drug Discovery , Binding Sites , Drug Interactions , Drug Repositioning , Humans

17.

PubChem BioAssay: A Decade's Development toward Open High-Throughput Screening Data Sharing.

Wang, Yanli; Cheng, Tiejun; Bryant, Stephen H.

SLAS Discov ; 22(6): 655-666, 2017 07.

Article in English | MEDLINE | ID: mdl-28346087

ABSTRACT

High-throughput screening (HTS) is now routinely conducted for drug discovery by both pharmaceutical companies and screening centers at academic institutions and universities. Rapid advance in assay development, robot automation, and computer technology has led to the generation of terabytes of data in screening laboratories. Despite the technology development toward HTS productivity, fewer efforts were devoted to HTS data integration and sharing. As a result, the huge amount of HTS data was rarely made available to the public. To fill this gap, the PubChem BioAssay database (https://www.ncbi.nlm.nih.gov/pcassay/) was set up in 2004 to provide open access to the screening results tested on chemicals and RNAi reagents. With more than 10 years' development and contributions from the community, PubChem has now become the largest public repository for chemical structures and biological data, which provides an information platform to worldwide researchers supporting drug development, medicinal chemistry study, and chemical biology research. This work presents a review of the HTS data content in the PubChem BioAssay database and the progress of data deposition to stimulate knowledge discovery and data sharing. It also provides a description of the database's data standard and basic utilities facilitating information access and use for new users.

Subject(s)

Databases, Factual , High-Throughput Screening Assays , Information Dissemination , Computational Biology/methods , High-Throughput Screening Assays/methods , RNA Interference , RNA, Small Interfering , Small Molecule Libraries , Web Browser

18.

Predicting drug-drug interactions through drug structural similarities and interaction networks incorporating pharmacokinetics and pharmacodynamics knowledge.

Takeda, Takako; Hao, Ming; Cheng, Tiejun; Bryant, Stephen H; Wang, Yanli.

J Cheminform ; 9: 16, 2017.

Article in English | MEDLINE | ID: mdl-28316654

ABSTRACT

Drug-drug interactions (DDIs) may lead to adverse effects and potentially result in drug withdrawal from the market. Predicting DDIs during drug development would help reduce development costs and time by rigorous evaluation of drug candidates. The primary mechanisms of DDIs are based on pharmacokinetics (PK) and pharmacodynamics (PD). This study examines the effects of 2D structural similarities of drugs on DDI prediction through interaction networks including both PD and PK knowledge. Our assumption was that a query drug (Dq) and a drug to be examined (De) are likely to have a DDI if the drugs in the interaction network of De are structurally similar to Dq. A network of De describes the associations between the drugs and the proteins relating to PK and PD for De. These include target proteins, proteins interacting with target proteins, enzymes, and transporters for De. We constructed logistic regression models for DDI prediction using only 2D structural similarities between each Dq and the drugs in the network of De. The results indicated that our models could effectively predict DDIs. It was found that integrating structural similarity scores of the drugs relating to both PK and PD of De was crucial for model performance. In particular, the combination of the target- and enzyme-related scores provided the largest increase in predictive power.
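
The general idea in this abstract (structural similarity scores fed into a logistic regression) can be illustrated with a minimal sketch. This is not the authors' implementation: the fingerprint representation (sets of on-bits), the use of the Tanimoto coefficient, the choice of the maximum similarity as the score, the two feature names, and the toy training data are all assumptions made for illustration only.

```python
import math

def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def max_similarity(query_fp, network_fps):
    """Highest similarity between the query drug (Dq) and any drug in one
    part of the network of De (assumed scoring rule, e.g. target-related drugs)."""
    return max((tanimoto(query_fp, fp) for fp in network_fps), default=0.0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(w, b, x):
    """Predicted probability that the pair (Dq, De) interacts."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data: each row is [target_related_score, enzyme_related_score]
# for one (Dq, De) pair; the label is 1 for a known DDI and 0 otherwise.
X_train = [[0.9, 0.8], [0.8, 0.7], [0.2, 0.1], [0.1, 0.3]]
y_train = [1, 1, 0, 0]
weights, bias = fit_logistic(X_train, y_train)

p_high = predict_proba(weights, bias, [0.9, 0.8])
p_low = predict_proba(weights, bias, [0.1, 0.1])
```

In this sketch, each similarity score summarizes how closely Dq resembles the drugs attached to one protein class in the network of De, and the regression weights indicate how much each class contributes to the prediction.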

19.

Supporting precision medicine by data mining across multi-disciplines: an integrative approach for generating comprehensive linkages between single nucleotide variants (SNVs) and drug-binding sites.

Roy Choudhury, Amrita; Cheng, Tiejun; Phan, Lon; Bryant, Stephen H; Wang, Yanli.

Bioinformatics ; 33(11): 1621-1629, 2017 Jun 01.

Article in English | MEDLINE | ID: mdl-28158543

ABSTRACT

MOTIVATION: Genetic variants in drug targets and metabolizing enzymes often have important functional implications, including altering the efficacy and toxicity of drugs. Identifying single nucleotide variants (SNVs) that contribute to differences in drug response and understanding their underlying mechanisms are fundamental to successful implementation of the precision medicine model. This work reports an effort to collect, classify and analyze SNVs that may affect the optimal response to currently approved drugs. RESULTS: An integrated approach was taken involving data mining across multiple information resources including databases containing drugs, drug targets, chemical structures, protein-ligand structure complexes, genetic and clinical variations as well as protein sequence alignment tools. We obtained 2640 SNVs of interest, most of which occur rarely in populations (minor allele frequency < 0.01). Clinical significance of only 9.56% of the SNVs is known in ClinVar, although 79.02% are predicted as deleterious. The examples here demonstrate that even if the mapped SNVs predicted as deleterious may not result in significant structural modifications, they can plausibly modify the protein-drug interactions, affecting selectivity and drug-binding affinity. Our analysis identifies potentially deleterious SNVs present on drug-binding residues that are relevant for further studies in the context of precision medicine. AVAILABILITY AND IMPLEMENTATION: Data are available from Supplementary information file. CONTACT: yanli.wang@nih.gov. SUPPLEMENTARY INFORMATION: Supplementary Tables S1-S5 are available at Bioinformatics online.

Subject(s)

Data Mining/methods , Polymorphism, Single Nucleotide , Protein Binding/genetics , Sequence Analysis, Protein/methods , Binding Sites , Gene Frequency , Humans , Precision Medicine/methods , Sequence Analysis, DNA/methods

20.

PubChem BioAssay: 2017 update.

Wang, Yanli; Bryant, Stephen H; Cheng, Tiejun; Wang, Jiyao; Gindulyte, Asta; Shoemaker, Benjamin A; Thiessen, Paul A; He, Siqian; Zhang, Jian.

Nucleic Acids Res ; 45(D1): D955-D963, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899599

ABSTRACT

PubChem's BioAssay database (https://pubchem.ncbi.nlm.nih.gov) has served as a public repository for small-molecule and RNAi screening data since 2004, providing open access to its data content for the community. PubChem accepts data submissions from researchers worldwide in academia, industry and government agencies. PubChem also exchanges data with other chemical biology database stakeholders. After more than a decade of development, it has become an important information resource supporting drug discovery and chemical biology research. To facilitate data discovery, PubChem is integrated with all other databases at NCBI. In this work, we provide an update on the PubChem BioAssay database, describing several recent developments including added sources of research data, a redesigned BioAssay record page, a new BioAssay classification browser and new features in the Upload system that facilitate data sharing.

Subject(s)

Databases, Chemical , Databases, Nucleic Acid , RNA Interference , Search Engine , Small Molecule Libraries , Drug Discovery , Gene Expression Regulation/drug effects , Humans , Software , User-Computer Interface , Web Browser
