Results 1 - 20 of 33
1.
Front Toxicol ; 6: 1393662, 2024.
Article in English | MEDLINE | ID: mdl-38800806

ABSTRACT

To study the ways in which compounds can induce adverse effects, toxicologists have been constructing Adverse Outcome Pathways (AOPs). An AOP can be considered as a pragmatic tool to capture and visualize mechanisms underlying different types of toxicity inflicted by any kind of stressor, and describes the interactions between key entities that lead to the adverse outcome on multiple biological levels of organization. The construction or optimization of an AOP is a labor intensive process, which currently depends on the manual search, collection, reviewing and synthesis of available scientific literature. This process could however be largely facilitated using Natural Language Processing (NLP) to extract information contained in scientific literature in a systematic, objective, and rapid manner that would lead to greater accuracy and reproducibility. This would support researchers to invest their expertise in the substantive assessment of the AOPs by replacing the time spent on evidence gathering by a critical review of the data extracted by NLP. As case examples, we selected two frequent adversities observed in the liver: namely, cholestasis and steatosis denoting accumulation of bile and lipid, respectively. We used deep learning language models to recognize entities of interest in text and establish causal relationships between them. We demonstrate how an NLP pipeline combining Named Entity Recognition and a simple rules-based relationship extraction model helps screen compounds related to liver adversities in the literature, but also extract mechanistic information for how such adversities develop, from the molecular to the organismal level. Finally, we provide some perspectives opened by the recent progress in Large Language Models and how these could be used in the future. 
We propose this work brings two main contributions: 1) a proof-of-concept that NLP can support the extraction of information from text for modern toxicology and 2) a template open-source model for recognition of toxicological entities and extraction of their relationships. All resources are openly accessible via GitHub (https://github.com/ontox-project/en-tox).
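
The pipeline described in this abstract pairs entity recognition with a simple rules-based relationship step. The sketch below illustrates only the general shape of that idea and is not the authors' implementation: a small dictionary lookup stands in for the deep learning entity recognizer, and the lexicons, trigger phrases, and function names are all hypothetical.

```python
import re

# Hypothetical mini-lexicons standing in for a trained NER model.
COMPOUNDS = {"cyclosporine", "valproic acid"}
ADVERSITIES = {"cholestasis", "steatosis"}
# Hypothetical trigger phrases for the causal rule.
TRIGGERS = {"induces", "causes", "leads to"}


def find_entities(sentence, lexicon, label):
    """Return sorted (start, end, text, label) tuples for lexicon terms in the sentence."""
    hits = []
    lowered = sentence.lower()
    for term in lexicon:
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((m.start(), m.end(), term, label))
    return sorted(hits)


def extract_relations(sentence):
    """Link a compound to a later adversity when a trigger phrase lies between them."""
    compounds = find_entities(sentence, COMPOUNDS, "COMPOUND")
    adversities = find_entities(sentence, ADVERSITIES, "ADVERSITY")
    lowered = sentence.lower()
    relations = []
    for _, c_end, c_text, _ in compounds:
        for a_start, _, a_text, _ in adversities:
            if a_start <= c_end:
                continue  # the rule only looks at adversities mentioned after the compound
            between = lowered[c_end:a_start]
            if any(trigger in between for trigger in TRIGGERS):
                relations.append((c_text, "CAUSES", a_text))
    return relations


print(extract_relations("Cyclosporine induces cholestasis in rats."))
```

A rule this simple misses negation and passive constructions, which is one reason extracted relations would still need the critical expert review the abstract describes.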

2.
Environ Pollut ; 352: 124109, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38718961

ABSTRACT

Exposure assessment is a crucial component of environmental health research, providing essential information on the potential risks associated with various chemicals. A systematic scoping review was conducted to acquire an overview of accessible human exposure assessment methods and computational tools to support and ultimately improve risk assessment. The systematic scoping review was performed in Sysrev, a web platform that introduces machine learning techniques into the review process aiming for increased accuracy and efficiency. Included publications were restricted to a publication date after the year 2000, where exposure methods were properly described. Exposure assessment methods were found to be used for a broad range of environmental chemicals including pesticides, metals, persistent chemicals, volatile organic compounds, and other chemical classes. Our results show that after the year 2000, for all the types of exposure routes, probabilistic analysis, and computational methods to calculate human exposure have increased. Sixty-three mathematical models and toolboxes were identified that have been developed in Europe, North America, and globally. However, only twelve occur frequently and their usefulness was associated with exposure route, chemical classes and input parameters used to estimate exposure. The outcome of the combined associations can function as a basis and/or guide for decision making for the selection of most appropriate method and tool to be used for environmental chemical human exposure assessments in Ontology-driven and artificial intelligence-based repeated dose toxicity testing of chemicals for next generation risk assessment (ONTOX) project and elsewhere.
Finally, the choice of input parameters used in each mathematical model and toolbox shown by our analysis can contribute to the harmonization process of the exposure models and tools increasing the prospect for comparison between studies and consistency in the regulatory process in the future.


Subject(s)
Environmental Exposure , Environmental Pollutants , Humans , Environmental Exposure/statistics & numerical data , Environmental Monitoring/methods , Machine Learning , Pesticides/toxicity , Risk Assessment/methods
4.
Biomed Pharmacother ; 174: 116530, 2024 May.
Article in English | MEDLINE | ID: mdl-38574623

ABSTRACT

BACKGROUND: Serum transaminases, alkaline phosphatase and bilirubin are common parameters used for DILI diagnosis, classification, and prognosis. However, the relevance of clinical examination, histopathology and drug chemical properties has not been fully investigated. As cholestasis is a frequent and complex DILI manifestation, our goal was to investigate the relevance of clinical features and drug properties to stratify drug-induced cholestasis (DIC) patients, and to develop a prognosis model to identify patients at risk and high-concern drugs. METHODS: DIC-related articles were searched by keywords and Boolean operators in seven databases. Relevant articles were uploaded onto Sysrev, a machine-learning based platform for article review and data extraction. Demographic, clinical, biochemical, and liver histopathological data were collected. Drug properties were obtained from databases or QSAR modelling. Statistical analyses and logistic regressions were performed. RESULTS: Data from 432 DIC patients associated with 52 drugs were collected. Fibrosis strongly associated with fatality, whereas canalicular paucity and ALP associated with chronicity. Drugs causing cholestasis clustered in three major groups. The pure cholestatic pattern divided into two subphenotypes with differences in prognosis, canalicular paucity, fibrosis, ALP and bilirubin. A predictive model of DIC outcome based on non-invasive parameters and drug properties was developed. Results demonstrate that physicochemical (pKa-a) and pharmacokinetic (bioavailability, CYP2C9) attributes impinged on the DIC phenotype and allowed the identification of high-concern drugs. CONCLUSIONS: We identified novel associations among DIC manifestations and disclosed novel DIC subphenotypes with specific clinical and chemical traits. The developed predictive DIC outcome model could facilitate DIC prognosis in clinical practice and drug categorization.


Subject(s)
Cholestasis , Machine Learning , Phenotype , Humans , Chemical and Drug Induced Liver Injury/diagnosis , Chemical and Drug Induced Liver Injury/etiology , Cholestasis/chemically induced , Databases, Factual , Prognosis
5.
ALTEX ; 41(1): 3-19, 2024.
Article in English | MEDLINE | ID: mdl-38194639

ABSTRACT

Green toxicology is marching chemistry into the 21st century. This emerging framework will transform how chemical safety is evaluated by incorporating evaluation of the hazards, exposures, and risks associated with chemicals into early product development in a way that minimizes adverse impacts on human and environmental health. The goal is to minimize toxic threats across entire supply chains through smarter designs and policies. Traditional animal testing methods are replaced by faster, cutting-edge innovations like organs-on-chips and artificial intelligence predictive models that are also more cost-effective. Core principles of green toxicology include utilizing alternative test methods, applying the precautionary principle, considering lifetime impacts, and emphasizing risk prevention over reaction. This paper provides an overview of these foundational concepts and describes current initiatives and future opportunities to advance the adoption of green toxicology approaches. Challenges and limitations are also discussed. Green shoots are emerging with governments offering carrots like the European Green Deal to nudge industry. Noteworthy, animal rights and environmental groups have different ideas about the needs for testing and their consequences for animal use. Green toxicology represents the way forward to support both these societal needs with sufficient throughput and human relevance for hazard information and minimal animal suffering. Green toxicology thus sets the stage to synergize human health and ecological values. Overall, the integration of green chemistry and toxicology has potential to profoundly shift how chemical risks are evaluated and managed to achieve safety goals in a more ethical, ecologically-conscious manner.


Green toxicology aims to make chemicals safer by design. It focuses on preventing toxicity issues early during development instead of testing after products are developed. Green toxicology uses modern non-animal methods like computer models and lab tests with human cells to predict if chemicals could be hazardous. Benefits are faster results, lower costs, and less animal testing. The principles of green toxicology include using alternative tests, applying caution even with uncertain data, considering lifetime impacts across global supply chains, and emphasizing prevention over reaction. The article highlights European and US policy efforts to spur sustainable chemistry innovation which will necessitate greener approaches to assess new materials and drive adoption. Overall, green toxicology seeks to integrate safer design concepts so that human and environmental health are valued equally with functionality and profit. This alignment promises safer, ethical products but faces challenges around validating new methods and overcoming institutional resistance to change.


Subject(s)
Artificial Intelligence , Chemical Safety , Animals , Humans , Animal Testing Alternatives , Environmental Health , Industry
6.
J Biomed Inform ; 145: 104465, 2023 09.
Article in English | MEDLINE | ID: mdl-37541407

ABSTRACT

BACKGROUND: Adverse outcome pathway (AOP) networks are versatile tools in toxicology and risk assessment that capture and visualize mechanisms driving toxicity originating from various data sources. They share a common structure consisting of a set of molecular initiating events and key events, connected by key event relationships, leading to the actual adverse outcome. AOP networks are to be considered living documents that should be frequently updated by feeding in new data. Such iterative optimization exercises are typically done manually, which not only is a time-consuming effort, but also bears the risk of overlooking critical data. The present study introduces a novel approach for AOP network optimization of a previously published AOP network on chemical-induced cholestasis using artificial intelligence to facilitate automated data collection followed by subsequent quantitative confidence assessment of molecular initiating events, key events, and key event relationships. METHODS: Artificial intelligence-assisted data collection was performed by means of the free web platform Sysrev. Confidence levels of the tailored Bradford-Hill criteria were quantified for the purpose of weight-of-evidence assessment of the optimized AOP network. Scores were calculated for biological plausibility, empirical evidence, and essentiality, and were integrated into a total key event relationship confidence value. The optimized AOP network was visualized using Cytoscape with the node size representing the incidence of the key event and the edge size indicating the total confidence in the key event relationship. RESULTS: This resulted in the identification of 38 and 135 unique key events and key event relationships, respectively. Transporter changes was the key event with the highest incidence, and formed the most confident key event relationship with the adverse outcome, cholestasis. 
Other important key events present in the AOP network include: nuclear receptor changes, intracellular bile acid accumulation, bile acid synthesis changes, oxidative stress, inflammation and apoptosis. CONCLUSIONS: This process led to the creation of an extensively informative AOP network focused on chemical-induced cholestasis. This optimized AOP network may serve as a mechanistic compass for the development of a battery of in vitro assays to reliably predict chemical-induced cholestatic injury.
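
The abstract states that scores for biological plausibility, empirical evidence, and essentiality were integrated into a total key event relationship confidence value, but it does not give the integration formula. The sketch below assumes an unweighted mean of scores scaled to [0, 1]; that choice, the function name, and every number in the example are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean


def ker_confidence(biological_plausibility, empirical_evidence, essentiality):
    """Combine three criterion scores in [0, 1] into one value.

    Assumption: an unweighted mean. The published method may weight or
    combine the criteria differently.
    """
    scores = (biological_plausibility, empirical_evidence, essentiality)
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    return mean(scores)


# Invented example scores for two key event relationships (KERs).
example_kers = {
    ("transporter changes", "cholestasis"): (0.9, 0.8, 0.7),
    ("oxidative stress", "cholestasis"): (0.6, 0.5, 0.4),
}

# Rank edges by total confidence, as one might before sizing edges in a network plot.
ranked = sorted(
    example_kers,
    key=lambda edge: ker_confidence(*example_kers[edge]),
    reverse=True,
)
print(ranked[0])
```

A total value of this kind could then be mapped to edge width in a network visualization, in the spirit of the Cytoscape rendering the abstract mentions.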


Subject(s)
Adverse Outcome Pathways , Cholestasis , Humans , Artificial Intelligence , Cholestasis/chemically induced , Risk Assessment , Data Collection
7.
Front Artif Intell ; 5: 984836, 2022.
Article in English | MEDLINE | ID: mdl-36171797

ABSTRACT

Recent metastatic castration-resistant prostate cancer (mCRPC) clinical trials have integrated homologous recombination and DNA repair deficiency (HRD/DRD) biomarkers into eligibility criteria and secondary objectives. These trials led to the approval of some PARP inhibitors for mCRPC with HRD/DRD indications. Unfortunately, biomarker-trial outcome data is only discovered by reviewing publications, a process that is error-prone, time-consuming, and laborious. While prostate cancer researchers have written systematic evidence reviews (SERs) on this topic, given the time involved from the last search to publication, an SER is often outdated even before publication. The difficulty in reusing previous review data has resulted in multiple reviews of the same trials. Thus, it will be useful to create a normalized evidence base from recently published/presented biomarker-trial outcome data that one can quickly update. We present a new approach to semi-automating normalized, open-access data tables from published clinical trials of metastatic prostate cancer using a data curation and SER platform. Clinicaltrials.gov and Pubmed.gov were used to collect mCRPC clinical trial publications with HRD/DRD biomarkers. We extracted data from 13 publications covering ten trials that started before 22nd Apr 2021. We extracted 585 hazard ratios, response rates, duration metrics, and 543 adverse events. Across 334 patients, we also extracted 8,180 patient-level survival and biomarker values. Data tables were populated with survival metrics, raw patient data, eligibility criteria, adverse events, and timelines. A repeated strong association between HRD and improved PARP inhibitor response was observed. Several use cases for the extracted data are demonstrated via analyses of trial methods, comparison of treatment hazard ratios, and association of treatments with adverse events. 
Machine learning models are also built on combined and normalized patient data to demonstrate automated discovery of therapy/biomarker relationships. Overall, we demonstrate the value of systematically extracted and normalized data. We have also made our code open-source with simple instructions on updating the analyses as new data becomes available, which anyone can use even with limited programming knowledge. Finally, while we present a novel method of SER for mCRPC trials, one can also implement such semi-automated methods in other clinical trial domains to advance precision medicine.

8.
ALTEX ; 39(1): 3-29, 2022.
Article in English | MEDLINE | ID: mdl-35034131

ABSTRACT

Safety sciences must cope with uncertainty of models and results as well as information gaps. Acknowledging this uncertainty necessitates embracing probabilities and accepting the remaining risk. Every toxicological tool delivers only probable results. Traditionally, this is taken into account by using uncertainty / assessment factors and worst-case / precautionary approaches and thresholds. Probabilistic methods and Bayesian approaches seek to characterize these uncertainties and promise to support better risk assessment and, thereby, improve risk management decisions. Actual assessments of uncertainty can be more realistic than worst-case scenarios and may allow less conservative safety margins. Most importantly, as soon as we agree on uncertainty, this defines room for improvement and allows a transition from traditional to new approach methods as an engineering exercise. The objective nature of these mathematical tools allows to assign each methodology its fair place in evidence integration, whether in the context of risk assessment, systematic reviews, or in the definition of an integrated testing strategy (ITS) / defined approach (DA) / integrated approach to testing and assessment (IATA). This article gives an overview of methods for probabilistic risk assessment and their application for exposure assessment, physiologically-based kinetic modelling, probability of hazard assessment (based on quantitative and read-across based structure-activity relationships, and mechanistic alerts from in vitro studies), individual susceptibility assessment, and evidence integration. Additional aspects are opportunities for uncertainty analysis of adverse outcome pathways and their relation to thresholds of toxicological concern. In conclusion, probabilistic risk assessment will be key for constructing a new toxicology paradigm - probably!


Subject(s)
Toxicology , Bayes Theorem , Risk Assessment , Uncertainty
10.
Front Artif Intell ; 4: 685298, 2021.
Article in English | MEDLINE | ID: mdl-34423285

ABSTRACT

Well-curated datasets are essential to evidence based decision making and to the integration of artificial intelligence with human reasoning across disciplines. However, many sources of data remain siloed, unstructured, and/or unavailable for complementary and secondary research. Sysrev was developed to address these issues. First, Sysrev was built to aid in systematic evidence reviews (SER), where digital documents are evaluated according to a well defined process, and where Sysrev provides an easy to access, publicly available and free platform for collaborating in SER projects. Secondly, Sysrev addresses the issue of unstructured, siloed, and inaccessible data in the context of generalized data extraction, where human and machine learning algorithms are combined to extract insights and evidence for better decision making across disciplines. Sysrev uses FAIR - Findability, Accessibility, Interoperability, and Reuse of digital assets - as primary principles in design. Sysrev was developed primarily because of an observed need to reduce redundancy, reduce inefficient use of human time and increase the impact of evidence based decision making. This publication is an introduction to Sysrev as a novel technology, with an overview of the features, motivations and use cases of the tool. Methods: Sysrev.com is a FAIR motivated web platform for data curation and SER. Sysrev allows users to create data curation projects called "sysrevs" wherein users upload documents, define review tasks, recruit reviewers, perform review tasks, and automate review tasks. Conclusion: Sysrev is a web application designed to facilitate data curation and SERs. Thousands of publicly accessible Sysrev projects have been created, accommodating research in a wide variety of disciplines. Described use cases include data curation, managed reviews, and SERs.

12.
Toxicology ; 458: 152846, 2021 06 30.
Article in English | MEDLINE | ID: mdl-34216698

ABSTRACT

The 3Rs concept, calling for replacement, reduction and refinement of animal experimentation, is receiving increasing attention around the world, and has found its way to legislation, in particular in the European Union. This is aligned by continuing high-level efforts of the European Commission to support development and implementation of 3Rs methods. In this respect, the European project called "ONTOX: ontology-driven and artificial intelligence-based repeated dose toxicity testing of chemicals for next generation risk assessment" was recently initiated with the goal to provide a functional and sustainable solution for advancing human risk assessment of chemicals without the use of animals in line with the principles of 21st century toxicity testing and next generation risk assessment. ONTOX will deliver a generic strategy to create new approach methodologies (NAMs) in order to predict systemic repeated dose toxicity effects that, upon combination with tailored exposure assessment, will enable human risk assessment. For proof-of-concept purposes, focus is put on NAMs addressing adversities in the liver, kidneys and developing brain induced by a variety of chemicals. The NAMs each consist of a computational system based on artificial intelligence and are fed by biological, toxicological, chemical and kinetic data. Data are consecutively integrated in physiological maps, quantitative adverse outcome pathway networks and ontology frameworks. Supported by artificial intelligence, data gaps are identified and are filled by targeted in vitro and in silico testing. ONTOX is anticipated to have a deep and long-lasting impact at many levels, in particular by consolidating Europe's world-leading position regarding the development, exploitation, regulation and application of animal-free methods for human risk assessment of chemicals.


Subject(s)
Artificial Intelligence , Gene Ontology , Toxicity Tests , Animal Testing Alternatives , Animals , Computer Simulation , European Union , Humans , In Vitro Techniques , Risk Assessment
13.
Environ Health Perspect ; 129(4): 47013, 2021 04.
Article in English | MEDLINE | ID: mdl-33929906

ABSTRACT

BACKGROUND: Humans are exposed to tens of thousands of chemical substances that need to be assessed for their potential toxicity. Acute systemic toxicity testing serves as the basis for regulatory hazard classification, labeling, and risk management. However, it is cost- and time-prohibitive to evaluate all new and existing chemicals using traditional rodent acute toxicity tests. In silico models built using existing data facilitate rapid acute toxicity predictions without using animals. OBJECTIVES: The U.S. Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) Acute Toxicity Workgroup organized an international collaboration to develop in silico models for predicting acute oral toxicity based on five different end points: Lethal Dose 50 (LD50) value, U.S. Environmental Protection Agency hazard (four) categories, Globally Harmonized System for Classification and Labeling hazard (five) categories, very toxic chemicals (LD50 ≤ 50 mg/kg), and nontoxic chemicals (LD50 > 2,000 mg/kg). METHODS: An acute oral toxicity data inventory for 11,992 chemicals was compiled, split into training and evaluation sets, and made available to 35 participating international research groups that submitted a total of 139 predictive models. Predictions that fell within the applicability domains of the submitted models were evaluated using external validation sets. These were then combined into consensus models to leverage strengths of individual approaches. RESULTS: The resulting consensus predictions, which leverage the collective strengths of each individual model, form the Collaborative Acute Toxicity Modeling Suite (CATMoS). CATMoS demonstrated high performance in terms of accuracy and robustness when compared with in vivo results. DISCUSSION: CATMoS is being evaluated by regulatory agencies for its utility and applicability as a potential replacement for in vivo rat acute oral toxicity studies.
CATMoS predictions for more than 800,000 chemicals have been made available via the National Toxicology Program's Integrated Chemical Environment tools and data sets (ice.ntp.niehs.nih.gov). The models are also implemented in a free, standalone, open-source tool, OPERA, which allows predictions of new and untested chemicals to be made. https://doi.org/10.1289/EHP8495.


Subject(s)
Government Agencies , Animals , Computer Simulation , Rats , Toxicity Tests, Acute , United States , United States Environmental Protection Agency
14.
Sci Rep ; 10(1): 8886, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32483272

ABSTRACT

This paper examines the effect of TET1 expression on survival in glioma patients using open-access data from the Genomic Data Commons. A neural network-based survival model was built on expression data from a selection of genes most affected by TET1 knockdown with a median cross-validated survival concordance of 82.5%. A synthetic experiment was then conducted that linked two separately trained neural networks: a multitask model estimating cancer hallmark gene expression from TET1 expression, and a survival neural network. This experiment quantified the mediation of the TET1 survival effect through eight cancer hallmarks: apoptosis, cell cycle, cell death, cell motility, DNA repair, immune response, two phosphorylation pathways, and a randomized gene set. Immune response, DNA repair, and apoptosis displayed greater mediation than the randomized gene set. Cell motility was inversely associated with only 12.5% mediated concordance. We propose the neural network linkage mediation experiment as an approach to collecting evidence of hazard mediation relationships with prognostic capacity useful for designing interventions.


Subject(s)
Brain Neoplasms/mortality , Gene Regulatory Networks , Glioma/mortality , Mixed Function Oxygenases/genetics , Proto-Oncogene Proteins/genetics , Brain Neoplasms/genetics , Databases, Genetic , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Gene Knockdown Techniques , Glioma/genetics , Humans , Mutation , Neural Networks, Computer , Sequence Analysis, RNA , Survival Analysis
15.
Sci Rep ; 10(1): 9718, 2020 Jun 11.
Article in English | MEDLINE | ID: mdl-32528098

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

16.
Sci Rep ; 10(1): 4106, 2020 03 05.
Article in English | MEDLINE | ID: mdl-32139709

ABSTRACT

Cancer is a comparatively well-studied disease, yet despite decades of intense focus, we demonstrate here using data from The Cancer Genome Atlas that a substantial number of genes implicated in cancer are relatively poorly studied. Those genes will likely be missed by any data analysis pipeline, such as enrichment analysis, that depends exclusively on annotations for understanding biological function. There is no indication that the amount of research - indicated by number of publications - is correlated with any objective metric of gene significance. Moreover, these genes are not missing at random but reflect that our information about genes is gathered in a biased manner: poorly studied genes are more likely to be primate-specific and less likely to have a Mendelian inheritance pattern, and they tend to cluster in some biological processes and not others. While this likely reflects both technological limitations as well as the fact that well-known genes tend to gather more interest from the research community, in the absence of a concerted effort to study genes in an unbiased way, many genes (and biological processes) will remain opaque.


Subject(s)
Neoplasms/genetics , Bibliometrics , Genes, Neoplasm , Genetic Association Studies , Genome, Human , Humans , Molecular Sequence Annotation
17.
Front Med (Lausanne) ; 6: 122, 2019.
Article in English | MEDLINE | ID: mdl-31214592

ABSTRACT

Experimental therapeutic oncology agents are often combined to circumvent tumor resistance to individual agents. However, most combination trials fail to demonstrate sufficient safety and efficacy to advance to a later phase. This study collected survey data on phase 1 combination therapy trials identified from ClinicalTrials.gov between January 1, 2003 and November 30, 2017 to assess trial design and the progress of combinations toward regulatory approval. Online surveys (N = 289, 23 questions total) were emailed to Principal Investigators (PIs) of early-phase National Cancer Institute and/or industry trials; 263 emails (91%) were received and 113 surveys completed (43%). Among phase 1 combination trials, 24.9% (95%CI: 15.3%, 34.4%) progressed to phase 2 or further; 18.7% (95%CI: 5.90%, 31.4%) progressed to phase 3 or regulatory approval; and 12.4% (95%CI: 0.00%, 25.5%) achieved regulatory approval. Observations of "clinical promise" in phase 1 combination studies were associated with higher rates of advancement past each milestone toward regulatory approval (cumulative OR = 11.9; p = 0.0002). Phase 1 combination study designs were concordant with Clinical Trial Design Task Force (CTD-TF) Recommendations 79.6% of the time (95%CI: 72.2%, 87.1%). Most discordances occurred where no plausible pharmacokinetic or pharmacodynamic interactions were expected. Investigator-defined "clinical promise" of a combination is associated with progress toward regulatory approval. Although concordance between study designs of phase 1 combination trials and CTD-TF Recommendations was relatively high, it may be beneficial to raise awareness about the best study design to use when no plausible pharmacokinetic or pharmacodynamic interactions are expected.
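
The survey percentages in this abstract follow directly from the reported counts, and the reported intervals are roughly symmetric with one lower bound sitting at 0.00%, which is what a normal-approximation (Wald) interval truncated to [0, 1] would produce. The abstract does not name the interval method, so the sketch below treats the Wald form as an assumption; the function name and the sample size in the usage line are hypothetical.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion, clipped to [0, 1].

    Assumption: the abstract does not state which interval method was used.
    """
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Counts reported in the abstract.
surveys_sent = 289
emails_received = 263
surveys_completed = 113

received_rate = emails_received / surveys_sent
completion_rate = surveys_completed / emails_received
print(f"received: {received_rate:.0%}, completed: {completion_rate:.0%}")

# Hypothetical sample size, chosen only to show how the interval is computed.
low, high = wald_ci(0.5, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

For small samples or proportions near 0 or 1, a Wilson or exact interval is usually preferred over the Wald form, since clipping at zero hides the poor coverage of the approximation.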

19.
Toxicol Res (Camb) ; 7(5): 732-744, 2018 Sep 01.
Article in English | MEDLINE | ID: mdl-30310652

ABSTRACT

The creation of large toxicological databases and advances in machine-learning techniques have empowered computational approaches in toxicology. Work with these large databases based on regulatory data has allowed reproducibility assessment of animal models, which highlight weaknesses in traditional in vivo methods. This should lower the bars for the introduction of new approaches and represents a benchmark that is achievable for any alternative method validated against these methods. Quantitative Structure Activity Relationships (QSAR) models for skin sensitization, eye irritation, and other human health hazards based on these big databases, however, also have made apparent some of the challenges facing computational modeling, including validation challenges, model interpretation issues, and model selection issues. A first implementation of machine learning-based predictions termed REACHacross achieved unprecedented sensitivities of >80% with specificities >70% in predicting the six most common acute and topical hazards covering about two thirds of the chemical universe. While this is awaiting formal validation, it demonstrates the new quality introduced by big data and modern data-mining technologies. The rapid increase in the diversity and number of computational models, as well as the data they are based on, create challenges and opportunities for the use of computational methods.
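
Sensitivity and specificity, the two figures quoted for REACHacross, are both derived from a confusion matrix of predicted versus observed hazard labels. The sketch below shows the standard definitions along with balanced accuracy (their mean); the counts in the example are invented for illustration and are not results from the paper.

```python
def confusion_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true hazards that were flagged
    specificity = tn / (tn + fp)  # fraction of non-hazards that were cleared
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Invented counts: 100 hazardous and 100 non-hazardous chemicals.
sens, spec, bal_acc = confusion_rates(tp=80, fn=20, tn=70, fp=30)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} balanced accuracy={bal_acc:.2f}")
```

Balanced accuracy is useful when hazardous and non-hazardous chemicals are unevenly represented, because plain accuracy can look high simply by predicting the majority class.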

20.
Toxicol Sci ; 165(1): 198-212, 2018 09 01.
Article in English | MEDLINE | ID: mdl-30007363

ABSTRACT

Earlier we created a chemical hazard database via natural language processing of dossiers submitted to the European Chemical Agency with approximately 10 000 chemicals. We identified repeat OECD guideline tests to establish reproducibility of acute oral and dermal toxicity, eye and skin irritation, mutagenicity and skin sensitization. Based on 350-700+ chemicals each, the probability that an OECD guideline animal test would output the same result in a repeat test was 78%-96% (sensitivity 50%-87%). An expanded database with more than 866 000 chemical properties/hazards was used as training data and to model health hazards and chemical properties. The constructed models automate and extend the read-across method of chemical classification. The novel models called RASARs (read-across structure activity relationship) use binary fingerprints and Jaccard distance to define chemical similarity. A large chemical similarity adjacency matrix is constructed from this similarity metric and is used to derive feature vectors for supervised learning. We show results on 9 health hazards from 2 kinds of RASARs: "Simple" and "Data Fusion". The "Simple" RASAR seeks to duplicate the traditional read-across method, predicting hazard from chemical analogs with known hazard data. The "Data Fusion" RASAR extends this concept by creating large feature vectors from all available property data rather than only the modeled hazard. Simple RASAR models tested in cross-validation achieve 70%-80% balanced accuracies with constraints on tested compounds. Cross validation of data fusion RASARs shows balanced accuracies in the 80%-95% range across 9 health hazards with no constraints on tested compounds.
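
The similarity measure named in this abstract, Jaccard distance on binary fingerprints, can be sketched in a few lines. The example below represents each fingerprint as the set of its "on" bit positions and uses a plain nearest-analog lookup as a stand-in for traditional read-across. It is not the RASAR method itself, which derives feature vectors from a similarity adjacency matrix for supervised learning, and the fingerprints, chemical names, and labels are invented.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two fingerprints given as sets of 'on' bit indices."""
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / len(union)


def nearest_analog_prediction(query, labeled):
    """Return (label, analog name, distance) for the closest labeled chemical.

    Assumption: a 1-nearest-neighbor rule as a simplified stand-in for read-across.
    """
    name, (fingerprint, label) = min(
        labeled.items(),
        key=lambda item: jaccard_distance(query, item[1][0]),
    )
    return label, name, jaccard_distance(query, fingerprint)


# Invented fingerprints and hazard labels (True = hazardous).
labeled_chemicals = {
    "chemical_A": ({1, 2, 3}, True),
    "chemical_B": ({7, 8}, False),
}

print(nearest_analog_prediction({1, 2, 4}, labeled_chemicals))
```

In practice the distance to the nearest analog also serves as a rough applicability check: a prediction drawn from a distant analog deserves less trust than one drawn from a close structural neighbor.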


Subject(s)
Animal Testing Alternatives , Data Mining , Databases, Chemical , Hazardous Substances , Machine Learning , Animals , Big Data , Hazardous Substances/chemistry , Hazardous Substances/toxicity , Humans , Models, Theoretical , Reproducibility of Results , Sensitivity and Specificity , Structure-Activity Relationship