1.
Toxics ; 11(2), 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36850973

ABSTRACT

Per- and polyfluoroalkyl substances (PFAS) are a diverse group of man-made chemicals that are commonly found in body tissues. The toxicokinetics of most PFAS are currently uncharacterized, but long half-lives (t½) have been observed in some cases. Knowledge of chemical-specific t½ is necessary for exposure reconstruction and extrapolation from toxicological studies. We used an ensemble machine learning method, random forest, to model the existing in vivo measured t½ across four species (human, monkey, rat, mouse) and eleven PFAS. Mechanistically motivated descriptors were examined, including two types of surrogates for renal transporters: (1) physiological descriptors, including kidney geometry, as surrogates for renal transporter expression and (2) structural similarity of defluorinated PFAS to endogenous chemicals as a surrogate for transporter affinity. We developed a classification model for t½ (Bin 1: <12 h; Bin 2: <1 week; Bin 3: <2 months; Bin 4: >2 months). The model had an accuracy of 86.1%, compared with 32.2% for a y-randomized null model. A total of 3890 compounds fell within the applicability domain of the model, and t½ was predicted using the bin medians: 4.9 h, 2.2 days, 33 days, and 3.3 years. For human t½, 56% of PFAS were classified in Bin 4, 7% in Bin 3, and 37% in Bin 2. This model synthesizes the limited available data to allow tentative extrapolation and prioritization.
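A minimal sketch of the modeling step described above, on synthetic stand-in data (the descriptors and bin labels below are placeholders, not the study's dataset):

```python
# Sketch only: random-forest classification of half-life bins plus a
# y-randomized null comparison, with synthetic stand-in data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))        # hypothetical descriptors (e.g., kidney geometry)
y = rng.integers(1, 5, size=200)     # bins 1 (<12 h) through 4 (>2 months)

clf = RandomForestClassifier(n_estimators=500, random_state=0)
acc = cross_val_score(clf, X, y, cv=5).mean()

y_null = rng.permutation(y)          # y-randomization: chance-level baseline
acc_null = cross_val_score(clf, X, y_null, cv=5).mean()
print(f"accuracy {acc:.3f} vs y-randomized null {acc_null:.3f}")
```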

3.
Comput Toxicol ; 18, 2021 May 01.
Article in English | MEDLINE | ID: mdl-34504984

ABSTRACT

Regulatory agencies worldwide face the challenge of performing risk-based prioritization of thousands of substances in commerce. In this study, a major effort was undertaken to compile a large genotoxicity dataset (54,805 records for 9299 substances) from several public sources (e.g., TOXNET, COSMOS, eChemPortal). The names and outcomes of the different assays were harmonized, and assays were annotated by type: gene mutation in Salmonella bacteria (Ames assay) and chromosome mutation (clastogenicity) in vitro or in vivo (chromosome aberration, micronucleus, and mouse lymphoma Tk+/- assays). This dataset was then evaluated to assess genotoxic potential using a categorization scheme whereby a substance was considered genotoxic if it was positive in at least one Ames or clastogenicity study. The categorization dataset comprised 8442 chemicals, of which 2728 were genotoxic, 5585 were not, and 129 were inconclusive. QSAR models (TEST and VEGA) and the OECD Toolbox structural alerts/profilers (e.g., OASIS DNA alerts for Ames and chromosomal aberrations) were used to make in silico predictions of genotoxic potential. The individual QSAR tools and structural alerts achieved balanced accuracies of 57-73%. A Naïve Bayes consensus model was developed using combinations of QSAR model and structural alert predictions. The 'best' consensus model selected had a balanced accuracy of 81.2%, a sensitivity of 87.2% and a specificity of 75.2%. This in silico scheme offers promise as a first step in ranking thousands of substances as part of a prioritization approach for genotoxicity.
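The overall-call rule and the consensus step lend themselves to a short sketch; the one below uses synthetic data, with BernoulliNB standing in for the paper's Naïve Bayes consensus:

```python
# Sketch: overall genotoxicity call from study records, then a Naive Bayes
# consensus over binary in silico tool predictions (synthetic data).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB

def overall_call(ames_outcomes, clastogen_outcomes):
    """Genotoxic if positive in at least one Ames or clastogenicity study."""
    return int(any(ames_outcomes) or any(clastogen_outcomes))

print(overall_call([0, 0, 1], [0]))              # -> 1 (genotoxic)

rng = np.random.default_rng(1)
tool_preds = rng.integers(0, 2, size=(500, 5))   # columns = QSAR tools / alerts
truth = rng.integers(0, 2, size=500)
bal_acc = cross_val_score(BernoulliNB(), tool_preds, truth, cv=5,
                          scoring="balanced_accuracy").mean()
print(f"consensus balanced accuracy: {bal_acc:.2f}")
```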

5.
Environ Health Perspect ; 129(4): 47013, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33929906

ABSTRACT

BACKGROUND: Humans are exposed to tens of thousands of chemical substances that need to be assessed for their potential toxicity. Acute systemic toxicity testing serves as the basis for regulatory hazard classification, labeling, and risk management. However, it is cost- and time-prohibitive to evaluate all new and existing chemicals using traditional rodent acute toxicity tests. In silico models built using existing data facilitate rapid acute toxicity predictions without using animals. OBJECTIVES: The U.S. Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) Acute Toxicity Workgroup organized an international collaboration to develop in silico models for predicting acute oral toxicity based on five different end points: the median lethal dose (LD50) value, U.S. Environmental Protection Agency hazard (four) categories, Globally Harmonized System for Classification and Labeling hazard (five) categories, very toxic chemicals (LD50 ≤ 50 mg/kg), and nontoxic chemicals (LD50 > 2,000 mg/kg). METHODS: An acute oral toxicity data inventory for 11,992 chemicals was compiled, split into training and evaluation sets, and made available to 35 participating international research groups that submitted a total of 139 predictive models. Predictions that fell within the applicability domains of the submitted models were evaluated using external validation sets and then combined into consensus models to leverage the strengths of individual approaches. RESULTS: The resulting consensus predictions form the Collaborative Acute Toxicity Modeling Suite (CATMoS). CATMoS demonstrated high accuracy and robustness when compared with in vivo results. DISCUSSION: CATMoS is being evaluated by regulatory agencies for its utility and applicability as a potential replacement for in vivo rat acute oral toxicity studies. CATMoS predictions for more than 800,000 chemicals have been made available via the National Toxicology Program's Integrated Chemical Environment tools and data sets (ice.ntp.niehs.nih.gov). The models are also implemented in a free, standalone, open-source tool, OPERA, which allows predictions to be made for new and untested chemicals. https://doi.org/10.1289/EHP8495.


Subject(s)
Government Agencies , Animals , Computer Simulation , Rats , Toxicity Tests, Acute , United States , United States Environmental Protection Agency
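The consensus idea, combining only the predictions whose applicability domain covers a given chemical, can be sketched as follows; the arrays are hypothetical and this is not the actual CATMoS implementation:

```python
# Sketch: applicability-domain-aware consensus of log10(LD50) predictions.
import numpy as np

preds = np.array([[2.1, 2.4, np.nan],        # rows = chemicals, cols = models
                  [1.0, np.nan, 1.3],        # NaN = chemical outside that
                  [3.2, 3.0, 2.9]])          # model's applicability domain
consensus = np.nanmean(preds, axis=1)        # consensus log10(LD50, mg/kg)
n_models = np.sum(~np.isnan(preds), axis=1)  # support behind each call
print(consensus, n_models)
```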
6.
Sci Total Environ ; 762: 143874, 2021 Mar 25.
Article in English | MEDLINE | ID: mdl-33401053

ABSTRACT

Endocrine-disrupting chemicals have the ability to interfere with and alter functions of the hormone system, leading to adverse effects on reproduction, growth and development. Despite growing concerns over their now ubiquitous presence in the environment, endocrine-related human health effects remain largely outside of comparative human toxicity characterization frameworks as applied, for example, in life cycle impact assessment. In this paper, we propose a new methodological framework to consistently integrate endocrine-related health effects into comparative human toxicity characterization. We present two quantitative and operational approaches for extrapolating towards a common point of departure from both in vivo and dosimetry-adjusted in vitro endocrine-related effect data, and for deriving effect factors as well as corresponding characterization factors for endocrine-active/endocrine-disrupting chemicals. Following the proposed approaches, we calculated effect factors for 323 chemicals, reflecting their endocrine potency, and related characterization factors for 157 chemicals, expressing their relative endocrine-related human toxicity potential. The developed effect and characterization factors are ready for use in the context of chemical prioritization and substitution as well as life cycle impact assessment and other comparative assessment frameworks. Endocrine-related effect factors were found comparable to existing effect factors for cancer and non-cancer effects, indicating that (1) the chemicals' endocrine potency is not necessarily higher or lower than other effect potencies and (2) using dosimetry-adjusted effect data to derive effect factors does not consistently overestimate the effect of potential endocrine disruptors. Calculated characterization factors span 8-11 orders of magnitude across substances and emission compartments and are dominated by the range in endocrine potencies.


Subject(s)
Endocrine Disruptors , Endocrine Disruptors/toxicity , Endocrine System , Humans , Reproduction
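For orientation, a heavily simplified sketch of how an effect factor and characterization factor combine in USEtox-style human toxicity characterization, the kind of framework this paper extends; the 0.5/ED50 convention and all numbers are assumptions for illustration, not the paper's derivation:

```python
# Hedged sketch, assuming the USEtox-style convention: effect factor
# EF = 0.5/ED50 (risk per kg intake) and characterization factor
# CF = intake fraction * EF. Values are illustrative placeholders.
ed50_kg = 1.2e-3          # hypothetical endocrine ED50-equivalent intake (kg)
intake_fraction = 1.0e-5  # hypothetical, e.g., for emission to air

effect_factor = 0.5 / ed50_kg                       # cases per kg intaken
char_factor = intake_fraction * effect_factor       # cases per kg emitted
print(f"EF = {effect_factor:.3g}, CF = {char_factor:.3g}")
```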
7.
Comput Toxicol ; 20: 100185, 2021 Nov 01.
Article in English | MEDLINE | ID: mdl-35128218

ABSTRACT

The Toxic Substances Control Act (TSCA) became law in the U.S. in 1976 and was amended in 2016. The amended law requires the U.S. EPA to perform risk-based evaluations of existing chemicals. Here, we developed a tiered approach to screen potential candidates based on their genotoxicity and carcinogenicity information to inform the selection of candidate chemicals for prioritization under TSCA. The approach was underpinned by a large database of carcinogenicity and genotoxicity information compiled from various public sources. Carcinogenicity data included weight-of-evidence human carcinogenicity evaluations and animal cancer data. Genotoxicity data included bacterial gene mutation data from the Salmonella (Ames) and Escherichia coli WP2 assays and chromosomal mutation (clastogenicity) data. Additionally, Ames and clastogenicity outcomes were predicted using the alert schemes within the OECD QSAR Toolbox and the Toxicity Estimation Software Tool (TEST). Evaluation workflows for carcinogenicity and genotoxicity were developed along with associated scoring schemes to make an overall outcome determination. For this case study, two sets of chemicals, the TSCA Active Inventory non-confidential portion list available on the EPA CompTox Chemicals Dashboard (33,364 chemicals, the 'TSCA Active List') and a representative proof-of-concept (POC) set of 238 chemicals, were profiled through the two workflows to make determinations of carcinogenicity and genotoxicity potential. Of the 33,364 substances on the TSCA Active List, overall calls could be made for 20,371 substances: 46.67% (9507) were non-genotoxic, 0.5% (103) were scored as inconclusive, 43.93% (8949) were predicted genotoxic and 8.9% (1812) were genotoxic. Overall calls for genotoxicity could be made for 225 of the 238 POC chemicals. Of these, 40.44% (91) were non-genotoxic, 2.67% (6) were inconclusive, 6.22% (14) were predicted genotoxic, and 50.67% (114) were genotoxic. The approach shows promise as a means to identify potential candidates for prioritization from a genotoxicity and carcinogenicity perspective.
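A minimal sketch of a tiered call of the kind described, with hypothetical record fields; the EPA workflow's actual scoring scheme is more elaborate:

```python
# Sketch: prefer experimental evidence; fall back to in silico predictions;
# otherwise inconclusive. Record fields below are hypothetical.
def genotoxicity_call(record):
    exp = record.get("experimental")   # list of 0/1 study outcomes, or None
    pred = record.get("predicted")     # list of 0/1 in silico calls, or None
    if exp:                            # tier 1: measured Ames/clastogenicity data
        return "genotoxic" if any(exp) else "non-genotoxic"
    if pred:                           # tier 2: QSAR Toolbox / TEST predictions
        return "predicted genotoxic" if any(pred) else "non-genotoxic"
    return "inconclusive"

print(genotoxicity_call({"experimental": [0, 1]}))   # -> genotoxic
print(genotoxicity_call({"predicted": [1]}))         # -> predicted genotoxic
```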

8.
Comput Toxicol ; 16, 2020 Nov 01.
Article in English | MEDLINE | ID: mdl-34017928

ABSTRACT

Human health risk assessment for environmental chemical exposure is limited by the fact that the vast majority of chemicals have little or no experimental in vivo toxicity data. Data gap filling techniques, such as quantitative structure activity relationship (QSAR) models based on chemical structure information, can predict hazard in the absence of experimental data. Risk assessment requires identification of a quantitative point-of-departure (POD) value, the point on the dose-response curve that marks the beginning of a low-dose extrapolation. This study presents two sets of QSAR models to predict POD values (PODQSAR) for repeat dose toxicity. For training and validation, a publicly available in vivo toxicity dataset for 3592 chemicals was compiled from the U.S. Environmental Protection Agency's Toxicity Value database (ToxValDB). The first set of QSAR models predicts point estimates of POD values (PODQSAR) using structural and physicochemical descriptors for combinations of repeat dose study type and species. A random forest QSAR model using study type and species as descriptors showed the best performance, with an external test set root mean square error (RMSE) of 0.71 log10-mg/kg/day and coefficient of determination (R²) of 0.53. The second set of QSAR models predicts the 95% confidence intervals for PODQSAR using a constructed POD distribution with a mean equal to the median POD value and a standard deviation of 0.5 log10-mg/kg/day, based on previously published estimates of the typical study-to-study variability that contributes uncertainty to model predictions. Bootstrap resampling of the pre-generated POD distribution was used to derive point estimates and 95% confidence intervals for each POD prediction. Enrichment analysis to evaluate the accuracy of PODQSAR showed that 80% of the 5% most potent chemicals were found in the top 20% of the most potent chemical predictions, suggesting that the repeat dose POD QSAR models presented here may help inform screening-level human health risk assessments in the absence of other data.
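One plausible reading of the uncertainty step, sketched with a hypothetical median prediction; the sd = 0.5 log10-mg/kg/day comes from the abstract, everything else is illustrative:

```python
# Sketch: normal POD distribution around the model's median prediction,
# bootstrap-resampled for a point estimate and 95% confidence interval.
import numpy as np

rng = np.random.default_rng(42)
pod_median = 1.2                     # hypothetical predicted log10(POD), mg/kg/day
pod_dist = rng.normal(pod_median, 0.5, size=10_000)

boot_medians = np.array([np.median(rng.choice(pod_dist, size=pod_dist.size))
                         for _ in range(1000)])
lo, hi = np.percentile(pod_dist, [2.5, 97.5])
print(f"point estimate {boot_medians.mean():.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```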

9.
Comput Toxicol ; 16, 2020 Nov 01.
Article in English | MEDLINE | ID: mdl-34124416

ABSTRACT

The toxicokinetic (TK) parameters fraction of the chemical unbound to plasma proteins (fub) and metabolic clearance are critical for relating exposure and internal dose when building in vitro-based risk assessment models. However, experimental TK studies have been carried out on only a limited set of chemicals of environmental interest (~1000 chemicals with TK data, relative to tens of thousands of chemicals of interest). This work evaluated the utility of chemical structure information for predicting TK parameters in silico: development of cluster-based read-across and quantitative structure-activity relationship models of fub (regression) and intrinsic clearance, Clint (classification and regression), using a dataset of 1487 chemicals; utilization of predicted TK parameters to estimate uncertainty in steady-state plasma concentration (Css); and subsequent in vitro-in vivo extrapolation analyses to derive a bioactivity-exposure ratio (BER) plot comparing human oral equivalent doses and exposure predictions, using androgen and estrogen receptor activity data for 233 chemicals as an example dataset. The results demonstrate that fub is structurally more predictable than Clint. The best-performing model for fub had an external test set RMSE/σ = 0.62 and R² = 0.61; for Clint classification, an external test set accuracy of 65.9%; and for Clint regression, an external test set RMSE/σ = 0.90 and R² = 0.20. This relatively low performance is in part due to the large uncertainty in the underlying Clint data. We show that Css is relatively insensitive to uncertainty in Clint. The models were benchmarked against the ADMET Predictor software. Finally, the BER analysis allowed identification of 14 out of 136 chemicals for further risk assessment, demonstrating the utility of these models in aiding risk-based chemical prioritization.
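For context, a simplified sketch of how fub and Clint typically enter a steady-state plasma concentration in HTTK-style IVIVE; the well-stirred liver model and the physiological constants below are standard assumptions used for illustration, not values taken from this paper:

```python
# Sketch: one-compartment steady-state Css from fub and in vitro Clint,
# assuming well-stirred hepatic clearance plus renal filtration.
def css_uM(dose_mg_kg_day, fub, clint_uL_min_per_1e6_cells,
           gfr_L_h=6.7, q_liver_L_h=90.0, mw_g_mol=300.0, bw_kg=70.0):
    # scale in vitro Clint to whole liver (~110e6 cells/g x ~1.8 kg liver)
    clint_L_h = clint_uL_min_per_1e6_cells * 11.9
    cl_hep = (q_liver_L_h * fub * clint_L_h
              / (q_liver_L_h + fub * clint_L_h))   # well-stirred liver model
    cl_renal = gfr_L_h * fub                       # passive glomerular filtration
    dose_umol_h = dose_mg_kg_day * bw_kg / mw_g_mol * 1000 / 24
    return dose_umol_h / (cl_hep + cl_renal)       # steady-state plasma conc (uM)

print(f"Css ~ {css_uM(1.0, fub=0.1, clint_uL_min_per_1e6_cells=10.0):.2f} uM")
```

Note how fub multiplies both clearance terms: when hepatic clearance is small relative to blood flow, Css scales roughly with 1/fub but only weakly with Clint, which is consistent with the abstract's observation that Css is relatively insensitive to Clint uncertainty.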

10.
Comput Toxicol ; 15: 100126, 2020 Aug 01.
Article in English | MEDLINE | ID: mdl-33426408

ABSTRACT

New approach methodologies (NAMs) for chemical hazard assessment are often evaluated via comparison to animal studies; however, variability in animal study data limits NAM accuracy. The US EPA Toxicity Reference Database (ToxRefDB) enables consideration of variability in effect levels, including the lowest effect level (LEL) for a treatment-related effect and the lowest observable adverse effect level (LOAEL) defined by expert review, from subacute, subchronic, chronic, multi-generation reproductive, and developmental toxicity studies. The objectives of this work were to quantify the variance within systemic LEL and LOAEL values, defined as potency values for effects in adult or parental animals only, and to estimate the upper limit of NAM prediction accuracy. Multiple linear regression (MLR) and augmented cell means (ACM) models were used to quantify the total variance in systemic LEL and LOAEL values and the fraction of that variance explained by available study descriptors (e.g., administration route, study type). The MLR approach treated each study descriptor as an independent contributor to variance, whereas the ACM approach combined categorical descriptors into cells to define replicates. Using these approaches, total variance in systemic LEL and LOAEL values (in log10-mg/kg/day units) ranged from 0.74 to 0.92. Unexplained variance in LEL and LOAEL values, approximated by the residual mean square error (MSE), ranged from 0.20 to 0.39. Considering subchronic, chronic, or developmental study designs separately yielded similar values. Based on the relationship between MSE and R-squared for goodness-of-fit, the maximal R-squared may approach 55-73% for a NAM-based predictive model of systemic toxicity using these data as reference. The root mean square error (RMSE) ranged from 0.47 to 0.63 log10-mg/kg/day, depending on dataset and regression approach, suggesting that a two-sided minimum prediction interval for systemic effect levels may have a width of 58- to 284-fold. These findings suggest quantitative considerations for building scientific confidence in NAM-based systemic toxicity predictions.
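The abstract's headline numbers can be approximately reproduced from two standard relations; z = 1.96 is an assumption here, so the computed fold-widths only bracket the reported 58- to 284-fold range:

```python
# Worked arithmetic: the R-squared ceiling is 1 - MSE/Var, and an RMSE in
# log10 units implies a prediction interval roughly 10**(2*z*RMSE) fold wide.
total_var = [0.74, 0.92]     # total variance, (log10-mg/kg/day)^2
mse = [0.20, 0.39]           # unexplained (residual) variance

print([f"{1 - m / v:.0%}" for m, v in zip(mse, total_var)])  # ~73% and ~58%

z = 1.96                     # assumed two-sided 95% multiplier
for rmse in (0.47, 0.63):
    print(f"RMSE {rmse}: ~{10 ** (2 * z * rmse):.0f}-fold wide")
```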

11.
Regul Toxicol Pharmacol ; 109: 104505, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31639428

ABSTRACT

The Toxic Substances Control Act (TSCA) mandates that the US EPA perform risk-based prioritisation of chemicals in commerce and then, for high-priority substances, develop risk evaluations that integrate toxicity data with exposure information. One approach being considered for data-poor chemicals is the Threshold of Toxicological Concern (TTC). Here, TTC values derived using oral (sub)chronic No Observed (Adverse) Effect Level (NO(A)EL) data from the EPA's Toxicity Values database (ToxValDB) were compared with the published TTC values of Munro et al. (1996). A total of 4554 chemicals with structures present in ToxValDB were assigned to their respective TTC categories using the Toxtree software tool; toxicity data were available for 1304 of these substances. The TTC values derived from ToxValDB were similar, but not identical, to the Munro TTC values: Cramer I (ToxValDB 37.3 vs. Munro 30 µg/kg-day), Cramer II (34.6 vs. 9.1 µg/kg-day) and Cramer III (3.9 vs. 1.5 µg/kg-day). The Cramer III 5th percentile values were found to be statistically different, and chemical features of the two Cramer III datasets were evaluated to account for the difference. TTC values derived from this expanded dataset substantiated the original TTC values, reaffirming the utility of TTC as a promising tool in a risk-based prioritisation approach.


Subject(s)
Hazardous Substances/standards , Threshold Limit Values , Toxicology/standards , United States Environmental Protection Agency/standards , Databases, Factual , Hazardous Substances/toxicity , Humans , No-Observed-Adverse-Effect Level , Risk Assessment/standards , Software , Toxicity Tests, Chronic/standards , Toxicity Tests, Subchronic/standards , Toxicology/legislation & jurisprudence , United States
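The Munro-style derivation underlying both sets of TTC values reduces to a percentile and an uncertainty factor; the sketch below uses synthetic NO(A)ELs and assumes the conventional 100-fold factor:

```python
# Sketch: TTC for a Cramer class = 5th percentile of the class's NO(A)EL
# distribution divided by a 100-fold uncertainty factor (synthetic data).
import numpy as np

rng = np.random.default_rng(7)
noael_mg_kg_day = 10 ** rng.normal(1.5, 1.0, size=500)  # hypothetical class set

p5 = np.percentile(noael_mg_kg_day, 5)     # 5th percentile NO(A)EL, mg/kg-day
ttc_ug_kg_day = p5 / 100 * 1000            # 100x uncertainty factor, mg -> ug
print(f"TTC ~ {ttc_ug_kg_day:.1f} ug/kg-day")
```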
12.
Regul Toxicol Pharmacol ; 106: 278-291, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31121201

ABSTRACT

Traditional approaches for chemical risk assessment cannot keep pace with the number of substances requiring assessment. Thus, in a global effort to expedite and modernize chemical risk assessment, New Approach Methodologies (NAMs) are being explored and developed. Included in this effort is the OECD Integrated Approaches for Testing and Assessment (IATA) program, which provides a forum for OECD member countries to develop and present case studies illustrating the application of NAMs in various risk assessment contexts. Here, we present an IATA case study for the prediction of the estrogenic potential of three target phenols: 4-tert-butylphenol, 2,4-di-tert-butylphenol and octabenzone. Key features of this IATA include the use of two computational approaches for analogue selection for read-across, data collected from traditional and NAM sources, and a workflow to generate predictions regarding the targets' ability to bind the estrogen receptor (ER). Endocrine disruption can occur when a chemical substance mimics the activity of natural estrogen by binding to the ER and, if potency and exposure are sufficient, alters the function of the endocrine system to cause adverse effects. The data indicated that, of the three target substances considered here, 4-tert-butylphenol is a potential endocrine disruptor. Further, this IATA illustrates that the NAM approach explored is health-protective when compared with the in vivo endpoints traditionally used for human health risk assessment.


Subject(s)
Benzophenones/pharmacology , Phenols/pharmacology , Receptors, Estrogen/metabolism , Benzophenones/chemistry , Humans , Molecular Structure , Phenols/chemistry , Risk Assessment
13.
J Expo Sci Environ Epidemiol ; 29(4): 557-567, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30310133

ABSTRACT

Multi-city population-based epidemiological studies of short-term fine particulate matter (PM2.5) exposures and mortality have observed heterogeneity in risk estimates between cities. Factors affecting exposures, such as pollutant infiltration, which are not captured by central-site monitoring data, can differ between communities, potentially explaining some of this heterogeneity. This analysis evaluates exposure factors as potential determinants of the heterogeneity in 312 core-based statistical area (CBSA)-specific associations between PM2.5 and mortality, using inverse variance weighted linear regression. Exposure factor variables were created from national survey data on housing characteristics, commuting patterns, heating fuel usage, and climatic factors. Where survey data were not available, air conditioning (AC) prevalence was predicted using machine learning techniques. Across all CBSAs, there was a 0.95% increase (interquartile range (IQR): 2.25) in non-accidental mortality per 10 µg/m3 increase in PM2.5, with significant heterogeneity between CBSAs. CBSAs with larger homes, more heating degree days, and a higher percentage of homes heated with oil had significantly (p < 0.05) higher health effect estimates, while cities with more gas heating had significantly lower estimates. While univariate models explained little of the heterogeneity in health effect estimates (R2 < 1%), multivariate models began to explain some of the observed heterogeneity (R2 = 13%).


Subject(s)
Environmental Exposure , Mortality , Particulate Matter/analysis , Particulate Matter/toxicity , Adult , Air Pollutants/analysis , Air Pollution/analysis , Cities , Female , Heating , Humans , Transportation
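A compact sketch of the second-stage inverse variance weighted regression described in the abstract; the city-level data and variable names below are synthetic:

```python
# Sketch: regress CBSA-specific PM2.5 mortality effect estimates on an
# exposure factor, weighting each CBSA by the inverse variance (1/SE^2)
# of its estimate. All data are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 312
ac_prevalence = rng.uniform(0.2, 1.0, size=n)          # exposure factor
se = rng.uniform(0.2, 0.8, size=n)                     # SE of each CBSA estimate
beta = 0.95 - 0.5 * ac_prevalence + rng.normal(0, se)  # % mortality per 10 ug/m3

X = ac_prevalence.reshape(-1, 1)
wls = LinearRegression().fit(X, beta, sample_weight=1 / se**2)
print(f"IVW slope: {wls.coef_[0]:.2f} per unit AC prevalence")
```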
14.
Regul Toxicol Pharmacol ; 101: 12-23, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30359698

ABSTRACT

The application of toxic equivalency factors (TEFs) or toxic units to estimate toxic potencies for mixtures of chemicals that contribute to a biological effect through a common mechanism is one approach for filling data gaps. Toxic Equivalents (TEQ) have been used to express the toxicity of dioxin-like compounds (i.e., dioxins, furans, and dioxin-like polychlorinated biphenyls (PCBs)) in terms of the most toxic form of dioxin: 2,3,7,8-tetrachlorodibenzo-p-dioxin (2,3,7,8-TCDD). This study sought to integrate two data gap filling techniques, quantitative structure-activity relationships (QSARs) and TEFs, to predict neurotoxicity TEQs for PCBs. Simon et al. (2007) previously derived neurotoxic equivalent (NEQ) values for a dataset of 87 PCB congeners, of which 83 congeners had experimental data. These data were taken from a set of four different studies measuring different effects related to neurotoxicity, each of which tested overlapping subsets of the 83 PCB congeners. The goals of the current study were to: (i) evaluate alternative neurotoxic equivalent factor (NEF) derivations from an expanded dataset, relative to those derived by Simon et al., and (ii) develop QSAR models to provide NEF estimates for the large number of untested PCB congeners. The models used multiple linear regression, support vector regression, k-nearest neighbor and random forest algorithms within a 5-fold cross validation scheme, with position-specific chlorine substitution patterns on the biphenyl scaffold as descriptors. Alternative NEF values were derived, but the resulting QSAR models had relatively low predictivity (RMSE ∼0.24), driven mostly by the large uncertainties in the underlying data and NEF values. The derived NEFs and the QSAR-predicted NEFs used to fill data gaps should therefore be applied with caution.


Subject(s)
Environmental Pollutants/toxicity , Neurotoxicity Syndromes , Polychlorinated Biphenyls/toxicity , Animals , Brain/metabolism , Calcium/metabolism , Dopamine/metabolism , Environmental Pollutants/chemistry , PC12 Cells , Polychlorinated Biphenyls/chemistry , Protein Kinase C/metabolism , Quantitative Structure-Activity Relationship , Rats , Risk Assessment , Ryanodine Receptor Calcium Release Channel/metabolism
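The descriptor scheme, position-specific chlorine substitution on the biphenyl scaffold, maps naturally to binary vectors; the sketch below pairs it with one of the named algorithms (random forest) under 5-fold CV, using placeholder NEF values:

```python
# Sketch: encode each PCB congener as a 10-bit vector of chlorine positions
# (2-6 and 2'-6' on the biphenyl scaffold) and fit a random forest to NEFs.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
X = rng.integers(0, 2, size=(83, 10)).astype(float)  # substitution descriptors
nef = rng.uniform(0.0, 1.0, size=83)                 # placeholder NEF values

rf = RandomForestRegressor(n_estimators=500, random_state=0)
rmse = -cross_val_score(rf, X, nef, cv=5,
                        scoring="neg_root_mean_squared_error").mean()
print(f"5-fold CV RMSE: {rmse:.2f}")
```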
15.
Regul Toxicol Pharmacol ; 85: 108-118, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28137642

ABSTRACT

Silver nanoparticles (AgNPs) are incorporated into medical devices for their anti-microbial characteristics. The potential exposure to and toxicity of AgNPs are unknown due to varying physicochemical particle properties and a lack of toxicological data. The aim of this safety assessment is to derive a provisional tolerable intake (pTI) value for AgNPs released from blood-contacting medical devices. A literature review of in vivo studies investigating critical health effects induced by intravenous (i.v.) exposure to AgNPs was evaluated using the Annapolis Accords principles and the Toxicological Data Reliability Assessment Tool (ToxRTool). The point of departure (POD) was based on an i.v. 28-day repeated-dose toxicity study of 20 nm AgNPs reporting an increase in relative spleen weight in rats, with a 5% lower confidence bound of the benchmark dose (BMDL05) of 0.14 mg/kg bw/day. The POD was extrapolated to humans using a modifying factor of 1,000 to account for intraspecies variability, interspecies differences and the lack of long-term toxicity data. The resulting pTI for long-term i.v. exposure to 20 nm AgNPs released from blood-contacting medical devices was 0.14 µg/kg bw/day. This pTI may not be appropriate for nanoparticles with other physicochemical properties or routes of administration, but the methodology is appropriate for deriving pTIs for nanoparticles in general.


Subject(s)
Equipment and Supplies , Metal Nanoparticles/toxicity , Silver/toxicity , Administration, Intravenous , Animals , Female , Humans , Male , Metal Nanoparticles/administration & dosage , Mice , No-Observed-Adverse-Effect Level , Rabbits , Rats , Risk Assessment , Silver/administration & dosage , Species Specificity , Uncertainty
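The pTI derivation reduces to a single division of the POD by the modifying factor, restated here with the abstract's own numbers:

```python
# Worked arithmetic from the abstract: pTI = POD / modifying factor.
bmdl05_mg_kg_bw_day = 0.14   # POD: 5% lower confidence bound of the BMD
modifying_factor = 1000      # intra-/interspecies + study-duration uncertainty

pti_ug_kg_bw_day = bmdl05_mg_kg_bw_day / modifying_factor * 1000  # mg -> ug
print(f"pTI = {pti_ug_kg_bw_day:.2f} ug/kg bw/day")               # 0.14, as reported
```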
16.
Comput Toxicol ; 4: 22-30, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-30057968

ABSTRACT

Read-across is an important data gap filling technique used within category and analog approaches for regulatory hazard identification and risk assessment. Although much technical guidance is available describing how to develop category/analog approaches, practical principles for evaluating and substantiating analog validity (suitability) are still lacking. This case study uses hindered phenols as an example chemical class to determine: (1) the capability of three structure fingerprint/descriptor methods (PubChem, ToxPrints and MoSS MCSS) to identify analogs for read-across prediction of estrogen receptor (ER) binding activity and (2) the utility of data confidence measures, physicochemical properties, and chemical R-group properties as filters to improve ER binding predictions. The training dataset comprised 462 hindered phenols and 257 non-hindered phenols. For each chemical of interest (target), source analogs were identified from the two datasets (hindered and non-hindered phenols), each characterized by a fingerprint/descriptor method and by two cut-offs: (1) minimum similarity distance (range: 0.1-0.9) and (2) N closest analogs (range: 1-10). Analogs were then filtered using: (1) physicochemical properties of the phenol (termed global filtering) and (2) physicochemical properties of the R-groups neighboring the active hydroxyl group (termed local filtering). A read-across prediction was made for each target chemical on the basis of a majority vote of the N closest analogs. The results demonstrate that: (1) concordance in ER activity increases with structural similarity, regardless of the structure fingerprint/descriptor method, (2) increased data confidence significantly improves read-across predictions, and (3) filtering analogs using global and local properties can help identify more suitable analogs. This case study illustrates that the quality of the underlying experimental data and the use of endpoint-relevant chemical descriptors to evaluate source analogs are critical to achieving robust read-across predictions.
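The majority-vote step can be sketched with a k-nearest-neighbor classifier over binary fingerprints; the Jaccard similarity measure and the synthetic data are assumptions, not the study's exact setup:

```python
# Sketch: read-across as a majority vote of the N closest analogs,
# using Jaccard similarity on binary structural fingerprints.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(11)
fingerprints = rng.integers(0, 2, size=(462, 64)).astype(bool)  # source analogs
er_active = rng.integers(0, 2, size=462)                        # ER binding calls

knn = KNeighborsClassifier(n_neighbors=5, metric="jaccard")     # N closest analogs
knn.fit(fingerprints, er_active)
target = rng.integers(0, 2, size=(1, 64)).astype(bool)          # chemical of interest
print("majority-vote ER call:", knn.predict(target)[0])
```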

17.
J Cheminform ; 8: 48, 2016.
Article in English | MEDLINE | ID: mdl-28316646

ABSTRACT

Quantitative structure activity relationships (QSARs) are theoretical models that relate a quantitative measure of chemical structure to a physical property or a biological effect. QSAR predictions can be used for chemical risk assessment for the protection of human and environmental health, which makes them interesting to regulators, especially in the absence of experimental data. For compatibility with regulatory use, QSAR models should be transparent, reproducible and optimized to minimize the number of false negatives. In silico QSAR tools are gaining wide acceptance as a faster alternative to otherwise time-consuming clinical and animal testing methods. However, different QSAR tools often make conflicting predictions for a given chemical and may also vary in their predictive performance across different chemical datasets. In a regulatory context, conflicting predictions raise interpretation, validation and adequacy concerns. To address these concerns, ensemble learning techniques in the machine learning paradigm can be used to integrate predictions from multiple tools. By leveraging various underlying QSAR algorithms and training datasets, the resulting consensus prediction should yield better overall predictive ability. We present a novel ensemble QSAR model using Bayesian classification. The model includes a tunable cut-off parameter that allows selection of the desired trade-off between model sensitivity and specificity. The predictive performance of the ensemble model is compared with four in silico tools (Toxtree, Lazar, OECD Toolbox, and Danish QSAR) for predicting carcinogenicity on a dataset of air toxins (332 chemicals) and a subset of the Gold carcinogenic potency database (480 chemicals). Leave-one-out cross validation results show that the ensemble model achieves the best trade-off between sensitivity and specificity (accuracy: 83.8% and 80.4%; balanced accuracy: 80.6% and 80.8%) and the highest inter-rater agreement (kappa (κ): 0.63 and 0.62) for both datasets. The ROC curves demonstrate the utility of the cut-off feature in the predictive ability of the ensemble model. This feature provides an additional control to regulators in grading a chemical based on the severity of the toxic endpoint under study.
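The cut-off feature amounts to sliding a decision threshold on the ensemble's posterior probability; a toy sketch with placeholder probabilities, not the published model:

```python
# Sketch: sweeping a cut-off on ensemble P(carcinogen) trades sensitivity
# against specificity, tracing out the ROC behavior described above.
import numpy as np

rng = np.random.default_rng(13)
p_carcinogen = rng.uniform(0, 1, size=332)   # ensemble posterior per chemical
truth = (p_carcinogen + rng.normal(0, 0.3, 332)) > 0.5   # synthetic labels

for cutoff in (0.3, 0.5, 0.7):
    pred = p_carcinogen > cutoff
    sens = (pred & truth).sum() / truth.sum()
    spec = (~pred & ~truth).sum() / (~truth).sum()
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```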

18.
Article in English | MEDLINE | ID: mdl-26671816

ABSTRACT

Molecular docking is a computational technique that predicts the binding energy and the preferred binding mode of a ligand to a protein target. Virtual screening uses docking to investigate large chemical libraries to identify ligands that bind favorably to a protein target. We have developed a novel scoring-based distributed protein docking application to improve enrichment in virtual screening. The application addresses the time and cost of screening, in contrast to conventional systematic parallel virtual screening methods, in two ways. First, it automates the process of creating and launching multiple independent dockings on a high-performance computing cluster. Second, it uses a Naïve Bayes scoring function to estimate the binding energy of un-docked ligands in order to identify and preferentially dock the (AutoDock-predicted) better binders. The application was tested on four proteins using a library of 10,573 ligands. In all the experiments: (i) 200 of the 1,000 best binders are identified after docking only ~14 percent of the chemical library; (ii) 9 or 10 best binders are identified after docking only ~19 percent of the chemical library; and (iii) no significant enrichment is observed after docking ~70 percent of the chemical library. The results show a significant increase in the enrichment of potential drug leads in early rounds of virtual screening.


Subject(s)
Algorithms , Models, Chemical , Molecular Docking Simulation/methods , Proteins/chemistry , Proteins/ultrastructure , Bayes Theorem , Binding Sites , Computer Simulation , Protein Binding
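A conceptual sketch of the prioritization loop: dock a seed batch, train a cheap Bayesian scorer on the results, and queue the remaining ligands by predicted probability of binding well. The descriptors, batch sizes and labels below are hypothetical:

```python
# Sketch: surrogate-guided docking prioritization (not the application itself).
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(17)
features = rng.normal(size=(10_573, 32))                # hypothetical descriptors
docked = rng.choice(10_573, size=1500, replace=False)   # seed docking runs
good_binder = rng.integers(0, 2, size=1500)             # labels from docking energies

scorer = GaussianNB().fit(features[docked], good_binder)
remaining = np.setdiff1d(np.arange(10_573), docked)
priority = remaining[np.argsort(-scorer.predict_proba(features[remaining])[:, 1])]
print("next ligands to dock:", priority[:5])
```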
19.
Stat Med ; 34(25): 3362-75, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26112310

ABSTRACT

Many gene expression datasets arise from two experiments in which the expression of the targeted genes is correlated across experiments. We consider problems whose objective is to find genes that are simultaneously upregulated or downregulated under both experiments. A Bayesian methodology is proposed based on directional multiple hypothesis testing. We propose a false discovery rate specific to the problem under consideration and construct a Bayes rule satisfying a false discovery rate criterion. The proposed method is compared with a traditional rule through simulation studies. We apply our methodology to two real examples involving microRNAs: in one, the targeted genes are simultaneously downregulated under both experiments; in the other, the targeted genes are downregulated in one experiment and upregulated in the other. We also discuss how the proposed methodology can be extended to more than two experiments.


Subject(s)
Bayes Theorem , Gene Expression Profiling/methods , Gene Expression Regulation , MicroRNAs/genetics , Models, Statistical , Algorithms , Computer Simulation , Databases, Genetic , Down-Regulation/genetics , Humans , Up-Regulation/genetics
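For orientation, a sketch of a generic posterior-probability-based FDR rule of the kind this methodology builds on; the paper's directional rule differs in detail, and the posteriors below are placeholders:

```python
# Sketch: reject the genes with the highest posterior probability of the
# directional alternative while the running expected FDR stays below alpha.
import numpy as np

rng = np.random.default_rng(19)
post = rng.uniform(0, 1, size=1000)   # P(gene down-regulated in both expts)

order = np.argsort(-post)             # most confident genes first
running_fdr = np.cumsum(1 - post[order]) / np.arange(1, post.size + 1)
n_reject = int(np.sum(running_fdr <= 0.05))
print(f"reject {n_reject} genes at Bayesian FDR <= 0.05")
```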
20.
Mol Inform ; 34(4): 236-45, 2015 Apr.
Article in English | MEDLINE | ID: mdl-27490169

ABSTRACT

The availability of large in vitro datasets enables better insight into the modes of action of chemicals and better identification of potential mechanism(s) of toxicity. Several studies have shown that not all in vitro assays contribute equally as predictors of in vivo carcinogenicity for the development of hybrid Quantitative Structure Activity Relationship (QSAR) models. We propose two novel approaches for using mechanistically relevant in vitro assay data to identify relevant biological descriptors and to develop Quantitative Biological Activity Relationship (QBAR) models for carcinogenicity prediction. We demonstrate that in vitro assay data can be used to develop QBAR models for in vivo carcinogenicity prediction via two case studies corroborated with firm scientific rationale. The case studies demonstrate the similarities between QBAR and QSAR modeling in: (i) the selection of relevant descriptors to be used in the machine learning algorithm, and (ii) the development of a computational model that maps chemical or biological descriptors to a toxic endpoint. The results of both case studies show: (i) improved accuracy and sensitivity, which are especially desirable under regulatory requirements, and (ii) overall adherence to the OECD/REACH guidelines. Such mechanism-based models can be used alongside QSAR models for the prediction of mechanistically complex toxic endpoints.


Subject(s)
Carcinogens/toxicity , Databases, Factual , Machine Learning , Models, Biological , Carcinogens/chemistry