Search | VHL Regional Portal

1.

An integrated and rapid evaluation of Curcumae Radix from different botanical origins based on chemical components, antiplatelet aggregation effect and Fourier transform near-infrared spectroscopy.

Wang, Meng; Hu, Tingting; Li, Yuhang; Wang, Rui; Xu, Yudie; Shi, Yabo; Tong, Huangjin; Yu, Mengting; Qin, Yuwen; Mei, Xi; Su, Lianlin; Mao, Chunqin; Lu, Tulin; Li, Lin; Ji, De; Jiang, Chengxi.

Spectrochim Acta A Mol Biomol Spectrosc ; 324: 124992, 2025 Jan 05.

Article in English | MEDLINE | ID: mdl-39163771

ABSTRACT

Curcumae Radix (CR) is a widely used traditional Chinese medicine with significant pharmaceutical importance, including enhancing blood circulation and addressing blood stasis. This study aims to establish an integrated and rapid quality assessment method for CR from various botanical origins, based on chemical components, antiplatelet aggregation effects, and Fourier transform near-infrared (FT-NIR) spectroscopy combined with multivariate algorithms. Firstly, ultra-performance liquid chromatography-photodiode array (UPLC-PDA) combined with chemometric analyses was used to examine variations in the chemical profiles of CR. Secondly, the activation effect on blood circulation of CR was assessed using an in vitro antiplatelet aggregation assay. The studies revealed significant variations in chemical profiles and antiplatelet aggregation effects among CR samples from different botanical origins, with constituents such as germacrone, β-elemene, bisdemethoxycurcumin, demethoxycurcumin, and curcumin showing a positive correlation with antiplatelet aggregation biopotency. Thirdly, FT-NIR spectroscopy was integrated with various machine learning algorithms, including Artificial Neural Network (ANN), K-Nearest Neighbors (KNN), Logistic Regression (LR), Support Vector Machine (SVM), and Subspace K-Nearest Neighbors (Subspace KNN), to classify CR samples from four distinct sources. The results showed that FT-NIR combined with KNN and SVM classification algorithms after SNV and MSC preprocessing successfully distinguished CR samples from four plant sources with an accuracy of 100%. Finally, quantitative models for active constituents and antiplatelet aggregation bioactivity were developed by optimizing the partial least squares (PLS) model with interval combination optimization (ICO) and competitive adaptive reweighted sampling (CARS) techniques. The CARS-PLS model achieved the best predictive performance across all five components. The coefficient of determination (R2p) and root mean square error (RMSEP) in the independent test sets were 0.9708 and 0.2098, 0.8744 and 0.2065, 0.9511 and 0.0034, 0.9803 and 0.0066, 0.9567 and 0.0172 for germacrone, β-elemene, bisdemethoxycurcumin, demethoxycurcumin and curcumin, respectively. The ICO-PLS model demonstrated superior predictive capabilities for antiplatelet aggregation biopotency, achieving an R2p of 0.9010, and an RMSEP of 0.5370. This study provides a valuable reference for the quality evaluation of CR in a more rapid and comprehensive manner.
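As a reading aid for the abstract above, the following minimal sketch shows what SNV preprocessing and the two reported test-set metrics compute. It is not the authors' code: the function names are hypothetical, SNV is assumed to use the sample standard deviation, and R2p is assumed to be the usual 1 - SS_res/SS_tot on the prediction set (some chemometrics packages report a squared correlation instead).

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale ONE spectrum by its own statistics."""
    m = mean(spectrum)
    s = stdev(spectrum)  # assumption: sample standard deviation (n - 1)
    return [(x - m) / s for x in spectrum]


def rmsep(y_true, y_pred):
    """Root mean square error of prediction on an independent test set."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(mean(squared_errors))


def r2p(y_true, y_pred):
    """Coefficient of determination on the test set (assumed 1 - SS_res / SS_tot)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up numbers for illustration only; they are not data from the study.
    print(snv([0.10, 0.20, 0.30]))
    print(r2p([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]), rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Because SNV normalises each spectrum independently, it needs no fitting step and can be applied identically to calibration and test spectra.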

Subject(s)

Curcuma , Platelet Aggregation Inhibitors , Platelet Aggregation , Spectroscopy, Near-Infrared , Curcuma/chemistry , Spectroscopy, Near-Infrared/methods , Platelet Aggregation/drug effects , Spectroscopy, Fourier Transform Infrared/methods , Platelet Aggregation Inhibitors/analysis , Platelet Aggregation Inhibitors/chemistry , Animals , Chromatography, High Pressure Liquid/methods , Drugs, Chinese Herbal/chemistry , Drugs, Chinese Herbal/analysis , Algorithms , Plant Extracts/chemistry

2.

Toward Increasing the Credibility of RNA Design.

Antczak, Maciej; Szachniuk, Marta.

Methods Mol Biol ; 2847: 137-151, 2025.

Article in English | MEDLINE | ID: mdl-39312141

ABSTRACT

In the problem of RNA design, also known as inverse folding, RNA sequences are predicted that achieve the desired secondary structure at the lowest possible free energy and under certain constraints. The designed sequences have applications in synthetic biology and RNA-based nanotechnologies. There are also known cases of the successful use of inverse folding to discover previously unknown noncoding RNAs. Several computational methods have been dedicated to the problem of RNA design. They differ by algorithm and additional parameters, e.g., those determining the goal function in the sequence optimization process. Users can obtain many promising RNA sequences quite easily. The more difficult issue is to critically evaluate them and select the most favorable and reliable sequence that forms the expected RNA structure. The latter problem is addressed in this paper. We propose an RNA design protocol extended to include sequence evaluation, for which a 3D structure is used. Experiments show that the accuracy of RNA design can be improved by adding a 3D structure prediction and analysis step.

Subject(s)

Algorithms , Computational Biology , Nucleic Acid Conformation , RNA Folding , RNA , RNA/chemistry , RNA/genetics , Computational Biology/methods , Software , Models, Molecular , Synthetic Biology/methods

3.

Generative Modeling of RNA Sequence Families with Restricted Boltzmann Machines.

Fernandez-de-Cossio-Diaz, Jorge.

Methods Mol Biol ; 2847: 163-175, 2025.

Article in English | MEDLINE | ID: mdl-39312143

ABSTRACT

In this chapter, we discuss the potential application of Restricted Boltzmann machines (RBM) to model sequence families of structured RNA molecules. RBMs are a simple two-layer machine learning model able to capture intricate sequence dependencies induced by secondary and tertiary structure, as well as mechanisms of structural flexibility, resulting in a model that can be successfully used for the design of allosteric RNA such as riboswitches. They have recently been experimentally validated as generative models for the SAM-I riboswitch aptamer domain sequence family. We introduce RBM mathematically and practically, providing self-contained code examples to download the necessary training sequence data, train the RBM, and sample novel sequences. We present in detail the implementation of algorithms necessary to use RBMs, focusing on applications in biological sequence modeling.
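To make the two-layer model in the abstract above concrete, here is a minimal sketch of an RBM energy function and one block Gibbs sampling step. It is not the chapter's code: it assumes binary visible and hidden units for brevity (sequence models typically use categorical visible units, one per alignment column), and all names (`W`, `a`, `b`, `gibbs_step`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def energy(v, h, W, a, b):
    """E(v, h) = -a.v - b.h - v^T W h for binary v and h."""
    visible = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden = sum(b_j * h_j for b_j, h_j in zip(b, h))
    coupling = sum(v[i] * W[i][j] * h[j]
                   for i in range(len(v)) for j in range(len(h)))
    return -(visible + hidden + coupling)


def sample_hidden(v, W, b, rng):
    """Hidden units are conditionally independent given the visible layer."""
    probs = [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b))]
    return [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, a, rng):
    """Visible units are conditionally independent given the hidden layer."""
    probs = [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(a))]
    return [1 if rng.random() < p else 0 for p in probs]


def gibbs_step(v, W, a, b, rng):
    """One alternating update: sample h given v, then a new v given h."""
    h = sample_hidden(v, W, b, rng)
    return sample_visible(h, W, a, rng), h


if __name__ == "__main__":
    rng = random.Random(0)
    W = [[2.0], [3.0]]          # 2 visible units, 1 hidden unit (toy values)
    a, b = [0.5, 0.1], [-1.0]
    v = [1, 0]
    for _ in range(5):
        v, h = gibbs_step(v, W, a, b, rng)
    print(v, h, energy(v, h, W, a, b))
```

The bipartite structure is what makes sampling cheap: each layer can be resampled in one pass given the other, which is the basis for drawing novel sequences from a trained model.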

Subject(s)

Algorithms , Machine Learning , Nucleic Acid Conformation , RNA , Riboswitch , RNA/chemistry , RNA/genetics , Riboswitch/genetics , Computational Biology/methods , Models, Molecular , Software

4.

Spatial differentiation of carbon emissions from energy consumption based on machine learning algorithm: A case study during 2015-2020 in Shaanxi, China.

Cao, Hongye; Han, Ling; Liu, Ming; Li, Liangzhi.

J Environ Sci (China) ; 149: 358-373, 2025 Mar.

Article in English | MEDLINE | ID: mdl-39181649

ABSTRACT

Carbon emissions resulting from energy consumption have become a pressing issue for governments worldwide. Accurate estimation of carbon emissions using satellite remote sensing data has become a crucial research problem. Previous studies relied on statistical regression models that failed to capture the complex nonlinear relationships between carbon emissions and characteristic variables. In this study, we propose a machine learning algorithm for carbon emissions, a Bayesian optimized XGBoost regression model, using multi-year energy carbon emission data and nighttime lights (NTL) remote sensing data from Shaanxi Province, China. Our results demonstrate that the XGBoost algorithm outperforms linear regression and four other machine learning models, with an R2 of 0.906 and RMSE of 5.687. We observe an annual increase in carbon emissions, with high-emission counties primarily concentrated in northern and central Shaanxi Province, displaying a shift from discrete, sporadic points to contiguous, extended spatial distribution. Spatial autocorrelation clustering reveals predominantly high-high and low-low clustering patterns, with economically developed counties showing high-emission clustering and less economically developed counties displaying low-emission clustering. Our findings show that the use of NTL data and the XGBoost algorithm can estimate and predict carbon emissions more accurately and provide a complementary reference for satellite remote sensing image data to serve carbon emission monitoring and assessment. This research provides an important theoretical basis for formulating practical carbon emission reduction policies and contributes to the development of techniques for accurate carbon emission estimation using remote sensing data.

Subject(s)

Algorithms , Environmental Monitoring , Machine Learning , China , Environmental Monitoring/methods , Air Pollutants/analysis , Carbon/analysis , Bayes Theorem , Remote Sensing Technology , Air Pollution/statistics & numerical data , Air Pollution/analysis

5.

Integration of MALDI-TOF MS and machine learning to classify enterococci: A comparative analysis of supervised learning algorithms for species prediction.

Kim, Eiseul; Yang, Seung-Min; Ham, Jun-Hyeok; Lee, Woojung; Jung, Dae-Hyun; Kim, Hae-Yeong.

Food Chem ; 462: 140931, 2025 Jan 01.

Article in English | MEDLINE | ID: mdl-39217752

ABSTRACT

This research focused on distinguishing distinct matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) spectral signatures of three Enterococcus species. We evaluated and compared the predictive performance of four supervised machine learning algorithms, K-nearest neighbor (KNN), support vector machine (SVM), and random forest (RF), to accurately classify Enterococcus species. This study involved a comprehensive dataset of 410 strains, generating 1640 individual spectra through on-plate and off-plate protein extraction methods. Although the commercial database correctly identified 76.9% of the strains, machine learning classifiers demonstrated superior performance (accuracy 0.991). In the RF model, top informative peaks played a significant role in the classification. Whole-genome sequencing showed that the most informative peaks are biomarkers connected to proteins, which are essential for understanding bacterial classification and evolution. The integration of MALDI-TOF MS and machine learning provides a rapid and accurate method for identifying Enterococcus species, improving healthcare and food safety.

Subject(s)

Enterococcus , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Supervised Machine Learning , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Enterococcus/classification , Enterococcus/chemistry , Enterococcus/isolation & purification , Enterococcus/genetics , Algorithms , Support Vector Machine , Bacterial Typing Techniques/methods , Machine Learning

6.

Systematic Inference of Multi-scale Chromatin Sub-compartments Using Calder2.

Liu, Yuanlong.

Methods Mol Biol ; 2856: 213-221, 2025.

Article in English | MEDLINE | ID: mdl-39283454

ABSTRACT

The compartmentalization of chromatin reflects its underlying biological activities. Inferring chromatin sub-compartments using Hi-C data is challenged by data resolution constraints. Consequently, comprehensive characterizations of sub-compartments have been limited to a select number of Hi-C experiments, with systematic comparisons across a wide range of tissues and conditions still lacking. Our original Calder algorithm marked a significant advancement in this field, enabling the identification of multi-scale sub-compartments at various data resolutions and facilitating the inference and comparison of chromatin architecture in over 100 datasets. Building on this foundation, we introduce Calder2, an updated version of Calder that brings notable improvements. These include expanded support for a wider array of genomes and organisms, an optimized bin size selection approach for more accurate chromatin compartment detection, and extended support for input and output formats. Calder2 thus stands as a refined analysis tool, significantly advancing genome-wide studies of 3D chromatin architecture and its functional implications.

Subject(s)

Algorithms , Chromatin , Software , Chromatin/genetics , Chromatin/metabolism , Computational Biology/methods , Humans , Animals

7.

Reconstruction of 3D Chromosome Structure from Single-Cell Hi-C Data via Recurrence Plots.

Hirata, Yoshito; Sugishita, Hiroki; Gotoh, Yukiko.

Methods Mol Biol ; 2856: 263-268, 2025.

Article in English | MEDLINE | ID: mdl-39283457

ABSTRACT

We describe an approach for reconstructing three-dimensional (3D) structures from single-cell Hi-C data. This approach has been inspired by a method of recurrence plots and visualization tools for nonlinear time series data. Some examples are also presented.
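For readers unfamiliar with the term, a recurrence plot is a binary matrix marking which pairs of points in a trajectory lie close to each other, which is structurally similar to a contact map. The sketch below only builds such a matrix from known coordinates; it is a generic illustration with hypothetical names and a hypothetical threshold `eps`, not the reconstruction method of the chapter, which goes in the opposite direction (from contacts to coordinates).

```python
import math


def recurrence_plot(points, eps):
    """Return R with R[i][j] = 1 when points i and j are within distance eps."""
    n = len(points)
    return [[1 if math.dist(points[i], points[j]) <= eps else 0
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Toy 2D trajectory; the third point is far from the first two.
    for row in recurrence_plot([(0, 0), (1, 0), (5, 0)], eps=1.5):
        print(row)
```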

Subject(s)

Single-Cell Analysis , Single-Cell Analysis/methods , Imaging, Three-Dimensional/methods , Humans , Software , Chromosomes/genetics , Algorithms

8.

4D Genome Analysis Using PHi-C2.

Shinkai, Soya; Onami, Shuichi.

Methods Mol Biol ; 2856: 271-279, 2025.

Article in English | MEDLINE | ID: mdl-39283458

ABSTRACT

Hi-C methods reveal 3D genome features but lack correspondence to dynamic chromatin behavior. PHi-C2, Python software, addresses this gap by transforming Hi-C data into polymer models. After the optimization algorithm, it enables us to calculate 3D conformations and conduct dynamic simulations, providing insights into chromatin dynamics, including the mean-squared displacement and rheological properties. This chapter introduces PHi-C2 usage, offering a tutorial for comprehensive 4D genome analysis.

Subject(s)

Algorithms , Chromatin , Software , Chromatin/genetics , Chromatin/chemistry , Chromatin/metabolism , Humans , Genomics/methods , Genome , Computational Biology/methods

9.

Learning Enhancer-Gene associations from Bulk Transcriptomic and Epigenetic Sequencing Data with STITCHIT.

Rumpf, Laura; Schulz, Marcel H.

Methods Mol Biol ; 2856: 341-356, 2025.

Article in English | MEDLINE | ID: mdl-39283463

ABSTRACT

To reveal gene regulation mechanisms, it is essential to understand the role of regulatory elements, which are possibly distant from gene promoters. Integrative analysis of epigenetic and transcriptomic data can be used to gain insights into gene-expression regulation in specific phenotypes. Here, we discuss STITCHIT, an approach to dissect epigenetic variation in a gene-specific manner across many samples for the identification of regulatory elements without relying on peak calling algorithms. The obtained genomic regions are then further refined using a regularized linear model approach, which can also be used to predict gene expression. We illustrate the use of STITCHIT using H3K27ac ChIP-seq and RNA-seq data from the International Human Epigenome Consortium (IHEC).

Subject(s)

Epigenesis, Genetic , Epigenomics , Transcriptome , Humans , Epigenomics/methods , Transcriptome/genetics , Enhancer Elements, Genetic , Software , Computational Biology/methods , Chromatin Immunoprecipitation Sequencing/methods , Gene Expression Regulation , Algorithms , Histones/genetics , Histones/metabolism , Gene Expression Profiling/methods

10.

Sequence Design for RNA-RNA Interactions.

Waldl, Maria; Yao, Hua-Ting; Hofacker, Ivo L.

Methods Mol Biol ; 2847: 1-16, 2025.

Article in English | MEDLINE | ID: mdl-39312133

ABSTRACT

The design of RNA sequences with desired structural properties presents a challenging computational problem with promising applications in biotechnology and biomedicine. Most regulatory RNAs function by forming RNA-RNA interactions, e.g., in order to regulate mRNA expression. It is therefore natural to consider problems where a sequence is designed to form a desired RNA-RNA interaction and switch between structures upon binding. This contribution demonstrates the use of the Infrared framework to design interacting sequences. Specifically, we consider the regulation of the rpoS mRNA by the sRNA DsrA and design artificial 5' UTRs that place a downstream protein coding gene under control of DsrA. The design process is explained step by step in a Jupyter notebook, accompanied by Python code. The text discusses setting up design constraints for sampling sequences in Infrared, computing quality measures, constructing a suitable cost function, as well as the optimization procedure. We show that not only thermodynamic but also kinetic folding features can be relevant. Kinetics of interaction formation can be estimated efficiently using the RRIkinDP tool, and the chapter explains how to include kinetic folding features from RRIkinDP directly in the cost function. The protocol implemented in our Jupyter notebook can easily be extended to consider additional requirements or adapted to novel design scenarios.

Subject(s)

Nucleic Acid Conformation , Thermodynamics , Computational Biology/methods , Software , Kinetics , RNA/genetics , RNA/chemistry , RNA/metabolism , 5' Untranslated Regions , RNA, Messenger/genetics , RNA, Messenger/chemistry , RNA, Messenger/metabolism , Algorithms , RNA Folding

11.

Riboswitch Design Using MODENA.

Taneda, Akito.

Methods Mol Biol ; 2847: 33-43, 2025.

Article in English | MEDLINE | ID: mdl-39312135

ABSTRACT

In silico design of artificial riboswitches is a challenging and intriguing task. Since experimental approaches such as in vitro selection are time-consuming processes, computational tools that guide riboswitch design are desirable to accelerate the design process. In this chapter, we describe the usage of the MODENA web server to design ON riboswitches on the basis of a multi-objective genetic algorithm and RNA secondary structure prediction.

Subject(s)

Algorithms , Computational Biology , Nucleic Acid Conformation , Riboswitch , Software , Computational Biology/methods

12.

Sequence Design Using RNAstructure.

Zhu, Mingyi; Mathews, David H.

Methods Mol Biol ; 2847: 17-31, 2025.

Article in English | MEDLINE | ID: mdl-39312134

ABSTRACT

RNA is present in all domains of life. It was once thought to be solely involved in protein expression, but recent advances have revealed its crucial role in catalysis and gene regulation through noncoding RNA. With a growing interest in exploring RNAs with specific structures, there is an increasing focus on designing RNA structures for in vivo and in vitro experimentation and for therapeutics. The development of RNA secondary structure prediction methods has also spurred the growth of RNA design software. However, there are challenges to designing RNA sequences that meet secondary structure requirements. One major challenge is that the secondary structure design problem is likely NP-hard, making it computationally intensive. Another issue is that objective functions need to consider the folding ensemble of RNA molecules to avoid off-target structures. In this chapter, we provide protocols for two software tools from the RNAstructure package: "Design" for structured RNA sequence design and "orega" for unstructured RNA sequence design.

Subject(s)

Computational Biology , Nucleic Acid Conformation , RNA , Software , RNA/chemistry , RNA/genetics , Computational Biology/methods , RNA Folding , Sequence Analysis, RNA/methods , Algorithms

13.

Machine Learning for RNA Design: LEARNA.

Runge, Frederic; Hutter, Frank.

Methods Mol Biol ; 2847: 63-93, 2025.

Article in English | MEDLINE | ID: mdl-39312137

ABSTRACT

Machine learning algorithms, and in particular deep learning approaches, have recently garnered attention in the field of molecular biology due to remarkable results. In this chapter, we describe machine learning approaches specifically developed for the design of RNAs, with a focus on the learna_tools Python package, a collection of automated deep reinforcement learning algorithms for secondary structure-based RNA design. We explain the basic concepts of reinforcement learning and its extension, automated reinforcement learning, and outline how these concepts can be successfully applied to the design of RNAs. The chapter is structured to guide through the usage of the different programs with explicit examples, highlighting particular applications of the individual tools.

Subject(s)

Algorithms , Machine Learning , Nucleic Acid Conformation , RNA , Software , RNA/chemistry , RNA/genetics , Computational Biology/methods , Deep Learning

14.

RNA Design Using incaRNAfbinv Demonstrated with the Identification of Functional RNA Motifs in Hepatitis Delta Virus.

Zakh, Rami; Churkin, Alexander; Barash, Danny.

Methods Mol Biol ; 2847: 109-120, 2025.

Article in English | MEDLINE | ID: mdl-39312139

ABSTRACT

Computational RNA design was introduced in the 1990s by Vienna's RNAinverse, which is a simple inverse RNA folding solver. Further developments and contemporary RNA design techniques, in addition to improved efficiency, offer more precise control over the designed sequences. incaRNAfbinv (incaRNAtion with RNA fragment-based inverse) is one such extension that builds upon RNAinverse and includes coarse-graining manipulations. The idea is that an RNA secondary structure can be decomposed into fragments of RNA motifs, and that a significant number of known natural RNA motifs exhibit a remarkable preservation in particular locations in a variety of genomes. This is taken into consideration by the ability of the user to select motifs that are known to be functional for a precise design, whilst the algorithm is more adaptable on other motifs. The latest version, incaRNAfbinv 2.0, is a free-to-use web server which deploys the above methodology of fragment-based design. Its control over the decomposed RNA secondary structure motifs includes, among other advanced features, the insertion of constraints in a flexible manner. The resulting designed RNA sequences are ranked by their proximity to classical RNA design. Features and capabilities of incaRNAfbinv 2.0 are hereby illustrated with an example taken from hepatitis delta virus (HDV). The web server is shown to help locate a known RNA motif that is responsible for HDV-3 RNA editing in more HDV genotypes than previously thought. This shows that computational RNA design by using inverse RNA folding is also a valuable strategy for locating functional RNA motifs in genomic data, in addition to artificially designing synthetic RNAs.

Subject(s)

Hepatitis Delta Virus , Nucleic Acid Conformation , Nucleotide Motifs , RNA, Viral , Hepatitis Delta Virus/genetics , RNA, Viral/genetics , RNA, Viral/chemistry , Nucleotide Motifs/genetics , Algorithms , Computational Biology/methods , Software , RNA Folding

15.

Simulated Annealing for RNA Design with SIMARD.

Tsang, Herbert H.

Methods Mol Biol ; 2847: 95-108, 2025.

Article in English | MEDLINE | ID: mdl-39312138

ABSTRACT

Ribonucleic acid (RNA) design is the inverse of RNA folding. RNA folding aims to identify the most likely secondary structure into which a given strand of nucleotides will fold. RNA design algorithms, on the other hand, attempt to design a strand of nucleotides that will fold into a specified secondary structure. Despite the apparent NP-hard nature of RNA design, promising results can be achieved when formulated as a combinatorial optimization problem and approached with simple heuristics. The main focus of this paper is to describe an RNA design algorithm based on simulated annealing. Additionally, noteworthy features and results will be presented herein.
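The combinatorial-optimization view described in the abstract above can be illustrated with a minimal simulated annealing loop. This is a toy sketch, not the SIMARD algorithm: the cost function is an assumption that only counts target base pairs whose two nucleotides cannot pair (Watson-Crick or G-U wobble), whereas a real design tool would score candidates with a folding or energy model. The cooling schedule, parameter values, and all function names are hypothetical.

```python
import math
import random

# Assumption: allowed pairs are Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return (i, j) index pairs from a dot-bracket string without pseudoknots."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def cost(seq, pairs):
    """Toy cost: number of target pairs whose nucleotides cannot base-pair."""
    return sum((seq[i], seq[j]) not in PAIRS for i, j in pairs)


def anneal(structure, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    pairs = parse_pairs(structure)
    seq = [rng.choice("ACGU") for _ in structure]
    current = cost(seq, pairs)
    for step in range(steps):
        if current == 0:
            break
        # Geometric cooling from t_start towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice("ACGU")  # propose a single-nucleotide mutation
        candidate = cost(seq, pairs)
        delta = candidate - current
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = candidate
        else:
            seq[pos] = old  # reject and restore the previous nucleotide
    return "".join(seq), current


if __name__ == "__main__":
    print(anneal("((((...))))"))
```

Accepting some cost-increasing mutations at high temperature is what lets the search leave local minima; as the temperature falls, the loop behaves increasingly like greedy descent.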

Subject(s)

Algorithms , Nucleic Acid Conformation , RNA Folding , RNA , RNA/chemistry , RNA/genetics , Software , Computational Biology/methods , Computer Simulation

16.

Monte Carlo Inverse RNA Folding.

Cazenave, Tristan; Touzani, Hamza.

Methods Mol Biol ; 2847: 205-215, 2025.

Article in English | MEDLINE | ID: mdl-39312146

ABSTRACT

The inverse RNA folding problem deals with designing a sequence of nucleotides that will fold into a desired target structure. Generalized Nested Rollout Policy Adaptation (GNRPA) is a Monte Carlo search algorithm for optimizing a sequence of choices. It learns a playout policy to intensify the search of the state space near the current best sequence. The algorithm uses a prior on the possible actions so as to perform non-uniform playouts when learning the problem instance at hand. We trained a transformer neural network on the inverse RNA folding problem using the Rfam database. This network is used to generate a prior for every Eterna100 puzzle. GNRPA is used with this prior to solve some of the instances of the Eterna100 dataset. The transformer prior gives better results than handcrafted heuristics.

Subject(s)

Algorithms , Monte Carlo Method , RNA Folding , RNA , RNA/chemistry , RNA/genetics , Nucleic Acid Conformation , Neural Networks, Computer , Computational Biology/methods

17.

SHAPE Probing to Screen Computationally Designed RNA.

Hardouin, Pierre; Lyonnet du Moutier, Francois-Xavier; Sargueil, Bruno.

Methods Mol Biol ; 2847: 177-191, 2025.

Article in English | MEDLINE | ID: mdl-39312144

ABSTRACT

RNA design is a major challenge for the future development of synthetic biology and RNA-based therapy. The development of efficient and accurate RNA design pipelines is based on trial-and-error strategies. The fast progression of such algorithms requires assaying the properties of many RNA sequences in a short time frame. High-throughput RNA structure chemical probing technologies such as SHAPE-MaP allow for assaying RNA structure and interaction rapidly and at a very large scale. However, the promiscuity of the designed sequences, which may differ by only one nucleotide, requires special care. In addition, it necessitates the analysis and evaluation of many experimental results, which may prove very tedious. Here we propose an experimental and analytical workflow that eases the screening of thousands of designed RNA sequences at once. In particular, we have developed shapemap tools, a customized software suite available at https://github.com/sargueil-citcom/shapemap-tools.

Subject(s)

Algorithms , Computational Biology , Nucleic Acid Conformation , RNA , Software , RNA/chemistry , RNA/genetics , Computational Biology/methods , Synthetic Biology/methods

18.

Datasets for Benchmarking RNA Design Algorithms.

Badura, Jan; Zok, Tomasz; Rybarczyk, Agnieszka.

Methods Mol Biol ; 2847: 229-240, 2025.

Article in English | MEDLINE | ID: mdl-39312148

ABSTRACT

RNA molecules play vital roles in many biological processes, such as gene regulation or protein synthesis. The adoption of a specific secondary and tertiary structure by RNA is essential to perform these diverse functions, making RNA a popular tool in bioengineering therapeutics. The field of RNA design responds to the need to develop novel RNA molecules that possess specific functional attributes. In recent years, computational tools for predicting RNA sequences with desired folding characteristics have improved and expanded. However, there is still a lack of well-defined and standardized datasets to assess these programs. Here, we present a large dataset of internal and multibranched loops extracted from PDB-deposited RNA structures that encompass a wide spectrum of design difficulties. Furthermore, we conducted benchmarking tests of widely utilized open-source RNA design algorithms employing this dataset.

Subject(s)

Algorithms , Benchmarking , Computational Biology , Nucleic Acid Conformation , RNA , RNA/genetics , RNA/chemistry , Computational Biology/methods , Software

19.

Molecular Similarity in Predictive Toxicology with a Focus on the q-RASAR Technique.

Banerjee, Arkaprava; Roy, Kunal.

Methods Mol Biol ; 2834: 41-63, 2025.

Article in English | MEDLINE | ID: mdl-39312159

ABSTRACT

The concept of similarity is an important aspect in various in silico-based prediction approaches. Most of these approaches follow the basic similarity property principle that states that two or more compounds having a high level of similarity are expected to exert similar biological activity or physicochemical property. Although in some cases this principle fails to predict the biological activity or property efficiently for certain compounds, it is applicable to most of the compounds in a given dataset. With the emerging need to efficiently fill data gaps in the regulatory context, Read-Across (RA), a similarity-based approach, has gained popularity, since this is not a statistical approach like QSAR, which requires a sizeable amount of data points to train a meaningful model. The basic idea behind Read-Across is the identification of the close source neighbors, and based on the similarity considerations, predictions are made for the query compound. Although RA is originally an unsupervised prediction method, recent efforts for quantitative Read-Across (q-RA) have introduced supervised similarity-based weightage for quantitative predictions. RA is a useful tool in predictive toxicology, but one of its important drawbacks is the lack of interpretability of the features (especially for q-RA) used to generate the Read-Across-based predictions. To bridge this gap, a novel quantitative Read-Across Structure-Activity Relationship (q-RASAR) approach has recently been proposed, which combines the concepts of QSAR and Read-Across, generating statistically reliable and predictive models using similarity and error-based descriptors. The q-RASAR models are simple and interpretable and can be efficiently used to identify not only the essential features but also the nature of the source and query compounds. In this chapter, we have discussed the concepts and various studies on RA, q-RA, and q-RASAR along with some of the tools available from different research groups.
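The Read-Across idea described above (find the closest source neighbors, then predict from them by similarity) can be sketched in a few lines. This is a generic illustration under stated assumptions, not the q-RA or q-RASAR procedure of the chapter: compounds are represented as sets of hypothetical feature indices, similarity is assumed to be the Tanimoto coefficient, and the prediction is a similarity-weighted mean over the `k` nearest source compounds.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def read_across(query, sources, k=3):
    """Similarity-weighted mean activity of the k most similar source compounds.

    sources is a list of (fingerprint, activity) tuples. Returns None when no
    source compound shares any feature with the query.
    """
    scored = sorted(((tanimoto(query, fp), y) for fp, y in sources),
                    key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return None
    return sum(sim * y for sim, y in scored) / total


if __name__ == "__main__":
    # Made-up fingerprints and activities for illustration only.
    sources = [({1, 2, 3}, 2.0), ({2, 3, 4}, 6.0), ({7, 8}, 9.0)]
    print(read_across({1, 2, 3}, sources, k=2))
```

No model is trained here, which is why the approach works with few data points; the trade-off noted in the abstract is that the weighting alone says little about which features drive the prediction.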

Subject(s)

Quantitative Structure-Activity Relationship , Computer Simulation , Toxicology/methods , Algorithms , Humans , Computational Biology/methods , Software

20.

QSAR: Using the Past to Study the Present.

Gini, Giuseppina C.

Methods Mol Biol ; 2834: 3-39, 2025.

Article in English | MEDLINE | ID: mdl-39312158

ABSTRACT

Quantitative structure-activity relationships (QSAR) is a method for predicting the physical and biological properties of small molecules; it is in use in industry and public services. However, like any scientific method, it is challenged by more and more requests, especially considering its possible role in assessing the safety of new chemicals. To answer the question of whether QSAR, by exploiting available knowledge, can build new knowledge, the chapter reviews QSAR methods in search of a QSAR epistemology. QSAR stands on three pillars, i.e., biological data, chemical knowledge, and modeling algorithms. Usually the biological data, resulting from good experimental practice, are taken as a true picture of the world; chemical knowledge has scientific bases; so if a QSAR model is not working, blame modeling. The role of modeling in developing scientific theories, and in producing knowledge, is thus analyzed. QSAR is a mature technology and is part of a large body of in silico methods and other computational methods. The active debate about the acceptability of the QSAR models, about the way to communicate them, and the explanation to provide accompanies the development of today's QSAR models. An example about predicting possible endocrine-disrupting chemicals (EDC) shows the many faces of modern QSAR methods.

Subject(s)

Quantitative Structure-Activity Relationship , Algorithms , Humans , Endocrine Disruptors/chemistry
