1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38828640

ABSTRACT

Cell hashing, a nucleotide barcode-based method that allows users to pool multiple samples and demultiplex in downstream analysis, has gained widespread popularity in single-cell sequencing due to its compatibility, simplicity, and cost-effectiveness. Despite these advantages, the performance of this method remains unsatisfactory under certain circumstances, especially in experiments that have imbalanced sample sizes or use many hashtag antibodies. Here, we introduce a hybrid demultiplexing strategy that increases accuracy and cell recovery in multi-sample single-cell experiments. This approach correlates the results of cell hashing and genetic variant clustering, enabling precise and efficient cell identity determination without additional experimental costs or efforts. In addition, we developed HTOreader, a demultiplexing tool for cell hashing that improves the accuracy of cut-off calling by avoiding the dominance of negative signals in experiments with many hashtags or imbalanced sample sizes. When compared to existing methods using real-world datasets, this hybrid approach and HTOreader consistently generate reliable results with increased accuracy and cell recovery.


Subject(s)
Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Algorithms , Software , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods
2.
BMC Genomics ; 25(1): 549, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824509

ABSTRACT

BACKGROUND: Despite Spirochetales being a ubiquitous and medically important order of bacteria infecting both humans and animals, there is extremely limited information regarding their bacteriophages. Of the genus Treponema, there is just a single reported characterised prophage. RESULTS: We applied a bioinformatic approach on 24 previously published Treponema genomes to identify and characterise putative treponemal prophages. Thirteen of the genomes did not contain any detectable prophage regions. The remaining eleven contained 38 prophage sequences, with between one and eight putative prophages in each bacterial genome. The prophage regions ranged from 12.4 to 75.1 kb, with between 27 and 171 protein coding sequences. Phylogenetic analysis revealed that 24 of the prophages formed three distinct sequence clusters, identifying putative myoviral and siphoviral morphology. ViPTree analysis demonstrated that the identified sequences were novel when compared to known double stranded DNA bacteriophage genomes. CONCLUSIONS: In this study, we have started to address the knowledge gap on treponeme bacteriophages by characterising 38 prophage sequences in 24 treponeme genomes. Using bioinformatic approaches, we have been able to identify and compare the prophage-like elements with respect to other bacteriophages, their gene content, and their potential to be a functional and inducible bacteriophage, which in turn can help focus our attention on specific prophages to investigate further.


Subject(s)
Genome, Bacterial , Genomics , Phylogeny , Prophages , Treponema , Prophages/genetics , Treponema/genetics , Treponema/virology , Genomics/methods , Computational Biology/methods , Genome, Viral , Bacteriophages/genetics , Bacteriophages/classification
3.
BMC Bioinformatics ; 25(1): 204, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824535

ABSTRACT

BACKGROUND: Protein solubility is a critically important physicochemical property closely related to protein expression. For example, it is one of the main factors to be considered in the design and production of antibody drugs and a prerequisite for realizing various protein functions. Although several solubility prediction models have emerged in recent years, many of these models are limited to capturing information embedded in one-dimensional amino acid sequences, resulting in unsatisfactory predictive performance. RESULTS: In this study, we introduce a novel Graph Attention network-based protein Solubility model, GATSol, which represents the 3D structure of proteins as a protein graph. In addition to the node features of amino acids extracted by the state-of-the-art protein large language model, GATSol utilizes amino acid distance maps generated using the latest AlphaFold technology. Rigorous testing on the independent eSOL and Saccharomyces cerevisiae test datasets has shown that GATSol outperforms most recently introduced models, especially with respect to the coefficient of determination R², which reaches 0.517 and 0.424, respectively. It outperforms the current state-of-the-art GraphSol by 18.4% on the S. cerevisiae test set. CONCLUSIONS: GATSol captures 3D structural features of proteins by building protein graphs, which significantly improves the accuracy of protein solubility prediction. Recent advances in protein structure modeling allow our method to incorporate spatial structure features extracted from predicted structures into the model by relying only on the input of protein sequences, which simplifies the entire graph neural network prediction process, making it more user-friendly and efficient. As a result, GATSol may help prioritize highly soluble proteins, ultimately reducing the cost and effort of experimental work. The source code and data of the GATSol model are freely available at https://github.com/binbinbinv/GATSol .


Subject(s)
Proteins , Solubility , Proteins/chemistry , Proteins/metabolism , Protein Conformation , Databases, Protein , Computational Biology/methods , Software , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae/chemistry , Algorithms , Models, Molecular , Amino Acid Sequence
4.
Oncol Res ; 32(6): 1011-1019, 2024.
Article in English | MEDLINE | ID: mdl-38827323

ABSTRACT

This review aimed to describe the involvement of microRNAs (miRNAs) in thyroid cancer (TC) and its subtypes, mainly medullary thyroid carcinoma (MTC), and to outline web-based tools and databases for bioinformatics analysis of miRNAs in TC. Additionally, the capacity of miRNAs to serve as therapeutic targets and biomarkers in TC management will be discussed. This review is based on a literature search of relevant articles on the role of miRNAs in TC and its subtypes, mainly MTC. Additionally, web-based tools and databases for bioinformatics analysis of miRNAs in TC were identified and described. MiRNAs can act as oncomiRs or anti-oncogenes, depending on the target mRNAs they regulate. MiRNA replacement therapy using miRNA mimics, or antimiRs that aim to suppress the function of certain miRNAs, can be applied to correct miRNAs aberrantly expressed in diseases, particularly in cancer. MiRNAs are involved in the modulation of fundamental pathways related to cancer, such as cell cycle checkpoints and DNA repair pathways. MiRNAs are also rather stable and can reliably be detected in different types of biological materials, rendering them favorable diagnostic and prognostic biomarkers as well. MiRNAs have emerged as promising tools for evaluating medical outcomes in TC and as possible therapeutic targets. The contribution of miRNAs in thyroid cancer, particularly MTC, is an active area of research, and the utility of web applications and databases for the biological data analysis of miRNAs in TC is becoming increasingly important.


Subject(s)
Biomarkers, Tumor , Carcinoma, Neuroendocrine , Computational Biology , MicroRNAs , Thyroid Neoplasms , Humans , Thyroid Neoplasms/genetics , Thyroid Neoplasms/diagnosis , Thyroid Neoplasms/therapy , Thyroid Neoplasms/pathology , MicroRNAs/genetics , Biomarkers, Tumor/genetics , Carcinoma, Neuroendocrine/genetics , Carcinoma, Neuroendocrine/pathology , Carcinoma, Neuroendocrine/diagnosis , Prognosis , Computational Biology/methods , Gene Expression Regulation, Neoplastic , Internet , Molecular Targeted Therapy
5.
Eur J Med Res ; 29(1): 307, 2024 Jun 02.
Article in English | MEDLINE | ID: mdl-38825674

ABSTRACT

BACKGROUND: Tumor necrosis factor receptor-associated factors family genes play a pivotal role in tumorigenesis and metastasis, functioning as adapters or E3 ubiquitin ligases across various signaling pathways. To date, limited research has explored the association between tumor necrosis factor receptor-associated factors family genes and the clinicopathological characteristics of tumors, immunity, and the tumor microenvironment (TME). This comprehensive study investigates the relationship between tumor necrosis factor receptor-associated factors family and prognosis, TME, immune response, and drug sensitivity in a pan-cancer context. METHODS: Utilizing current public databases, this study examines the expression levels and prognostic significance of tumor necrosis factor receptor-associated factors family genes in a pan-cancer context through bioinformatic analysis. In addition, it investigates the correlation between tumor necrosis factor receptor-associated factors expression and various factors, including the TME, immune subtypes, stemness scores, and drug sensitivity in pan-cancer. RESULTS: Elevated expression levels of tumor necrosis factor receptor-associated factor 2, 3, 4, and 7 were observed across various cancer types. Patients exhibiting high expression of these genes generally faced a worse prognosis. Furthermore, a significant correlation was noted between the expression of tumor necrosis factor receptor-associated factors family genes and multiple dimensions of the TME, immune subtypes, and drug sensitivity.


Subject(s)
Neoplasms , Tumor Microenvironment , Humans , Prognosis , Neoplasms/genetics , Neoplasms/drug therapy , Tumor Microenvironment/genetics , Tumor Microenvironment/immunology , Tumor Necrosis Factor Receptor-Associated Peptides and Proteins/genetics , Gene Expression Regulation, Neoplastic , Computational Biology/methods , Drug Resistance, Neoplasm/genetics , Biomarkers, Tumor/genetics
6.
PeerJ ; 12: e17280, 2024.
Article in English | MEDLINE | ID: mdl-38827298

ABSTRACT

Cuproptosis-related key genes play a significant role in the pathological processes of acute myocardial infarction (AMI). However, a complete understanding of the molecular mechanisms behind this participation remains elusive. This study was designed to identify genes and immune cells critical to AMI pathogenesis. Based on the GSE48060 dataset (31 AMI patients and 21 healthy persons, GPL570-55999), we identified genes associated with dysregulated cuproptosis and the activation of immune responses between normal subjects and patients with a first myocardial attack. Two molecular clusters associated with cuproptosis were defined in patients with AMI. Immune infiltration analysis showed that there was significant immune heterogeneity among different clusters. Multiple immune responses were closely associated with Cluster2-specific differentially expressed genes (DEGs). The generalized linear model presented the best discriminative performance, with relatively lower residual and root mean square error, and a higher area under the curve (AUC = 0.870). A final two-gene-based generalized linear model was constructed, exhibiting satisfactory performance in two external validation datasets (AUC = 0.719, GSE66360 and AUC = 0.856, GSE123342). Column graph, calibration curve, and decision curve analyses also proved the accuracy of AMI prediction. We also constructed a mouse C57BL/6 model of AMI (3 h, 48 h, and 1 week) and used qRT-PCR and immunofluorescence to detect the expression changes of CBLB and ZNF302. In this study, we present a systematic analysis of the complex relationship between cuproptosis and a first AMI attack, and provide new insights into the diagnosis and treatment of AMI.


Subject(s)
Computational Biology , Disease Models, Animal , Myocardial Infarction , Myocardial Infarction/genetics , Animals , Mice , Computational Biology/methods , Biomarkers/metabolism , Humans , Mice, Inbred C57BL , Gene Expression Profiling/methods , Male
7.
Oncol Res ; 32(6): 1093-1107, 2024.
Article in English | MEDLINE | ID: mdl-38827320

ABSTRACT

Breast cancer is the leading cause of cancer-related deaths in women worldwide, with Hormone Receptor (HR)+ being the predominant subtype. Tamoxifen (TAM) serves as the primary treatment for HR+ breast cancer. However, drug resistance often leads to recurrence, underscoring the need to develop new therapies to enhance patient quality of life and reduce recurrence rates. Artemisinin (ART) has demonstrated efficacy in inhibiting the growth of drug-resistant cells, positioning ART as a viable option for counteracting endocrine resistance. This study explored the interaction between artemisinin and tamoxifen through a combined approach of bioinformatics analysis and experimental validation. Five characterized genes (ar, cdkn1a, erbb2, esr1, hsp90aa1) and seven drug-disease crossover genes (cyp2e1, rorc, mapk10, glp1r, egfr, pgr, mgll) were identified using WGCNA crossover analysis. Subsequent functional enrichment analyses were conducted. Our findings confirm a significant correlation between key cluster gene expression and immune cell infiltration in tamoxifen-resistant and -sensitized patients. scRNA-seq analysis revealed high expression of key cluster genes in epithelial cells, suggesting artemisinin's specific impact on tumor cells in estrogen receptor (ER)-positive BC tissues. Molecular target docking and in vitro experiments with artemisinin on LCC9 cells demonstrated a reversal effect, reducing the migration and drug resistance of drug-resistant cells by modulating relevant drug resistance genes. These results indicate that artemisinin could potentially reverse tamoxifen resistance in ER-positive breast cancer.


Subject(s)
Artemisinins , Breast Neoplasms , Computational Biology , Drug Resistance, Neoplasm , Receptors, Estrogen , Tamoxifen , Tamoxifen/pharmacology , Tamoxifen/therapeutic use , Humans , Artemisinins/pharmacology , Artemisinins/therapeutic use , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Female , Drug Resistance, Neoplasm/genetics , Computational Biology/methods , Receptors, Estrogen/metabolism , Antineoplastic Agents, Hormonal/pharmacology , Antineoplastic Agents, Hormonal/therapeutic use , Gene Expression Regulation, Neoplastic/drug effects , Cell Line, Tumor , Molecular Docking Simulation , Cell Proliferation/drug effects
8.
Nutrients ; 16(10)2024 May 20.
Article in English | MEDLINE | ID: mdl-38794775

ABSTRACT

BACKGROUND: This study aims to identify unique metabolomics biomarkers associated with Type 2 Diabetes (T2D) and develop an accurate diagnostics model using tree-based machine learning (ML) algorithms integrated with bioinformatics techniques. METHODS: Univariate and multivariate analyses such as fold change, a receiver operating characteristic curve (ROC), and Partial Least-Squares Discriminant Analysis (PLS-DA) were used to identify biomarker metabolites that showed significant concentration in T2D patients. Three tree-based algorithms [eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Adaptive Boosting (AdaBoost)] that demonstrated robustness in high-dimensional data analysis were used to create a diagnostic model for T2D. RESULTS: As a result of the biomarker discovery process validated with three different approaches, Pyruvate, D-Rhamnose, AMP, pipecolate, Tetradecenoic acid, Tetradecanoic acid, Dodecanediothioic acid, Prostaglandin E3/D3 (isobars), ADP and Hexadecenoic acid were determined as potential biomarkers for T2D. Our results showed that the XGBoost model [accuracy = 0.831, F1-score = 0.845, sensitivity = 0.882, specificity = 0.774, positive predictive value (PPV) = 0.811, negative predictive value (NPV) = 0.857 and area under the ROC curve (AUC) = 0.887] had the highest performance measures by a slight margin. CONCLUSIONS: ML integrated with bioinformatics techniques offers accurate and positive T2D candidate biomarker discovery. The XGBoost model can successfully distinguish T2D based on metabolites.
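
The performance measures listed in this abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions. The function name and the example counts are hypothetical: the counts are not taken from the paper and were chosen only so that the results round to the figures quoted above.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }


# Hypothetical counts, for illustration only (not reported in the abstract).
metrics = diagnostic_metrics(tp=30, fp=7, tn=24, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The AUC is not recoverable from a single confusion matrix, because it depends on the ranking of predicted scores across all thresholds.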


Subject(s)
Biomarkers , Computational Biology , Diabetes Mellitus, Type 2 , Machine Learning , Metabolomics , Diabetes Mellitus, Type 2/metabolism , Humans , Biomarkers/blood , Computational Biology/methods , Pilot Projects , Male , Middle Aged , Female , Metabolomics/methods , ROC Curve , Algorithms , Aged , Adult
9.
Nat Commun ; 15(1): 4476, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796523

ABSTRACT

Protein functions are characterized by interactions with proteins, drugs, and other biomolecules. Understanding these interactions is essential for deciphering the molecular mechanisms underlying biological processes and developing new therapeutic strategies. Current computational methods mostly predict interactions based on either molecular network or structural information, without integrating them within a unified multi-scale framework. While a few multi-view learning methods are devoted to fusing the multi-scale information, these methods tend to rely heavily on a single scale and under-fit the others, likely because of the imbalanced nature and inherent greediness of multi-scale learning. To alleviate the optimization imbalance, we present MUSE, a multi-scale representation learning framework based on a variant expectation maximization to optimize different scales in an alternating procedure over multiple iterations. This strategy efficiently fuses multi-scale information between atomic structure and molecular network scale through mutual supervision and iterative optimization. MUSE outperforms the current state-of-the-art models not only in molecular interaction (protein-protein, drug-protein, and drug-drug) tasks but also in protein interface prediction at the atomic structure scale. More importantly, the multi-scale learning framework shows potential for extension to other scales of computational drug discovery.


Subject(s)
Computational Biology , Proteins , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Algorithms , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Machine Learning , Drug Interactions , Humans , Protein Binding
10.
Methods Mol Biol ; 2726: 45-83, 2024.
Article in English | MEDLINE | ID: mdl-38780727

ABSTRACT

Several different ways to predict RNA secondary structures have been suggested in the literature. Statistical methods, such as those that utilize stochastic context-free grammars (SCFGs), or approaches based on machine learning aim to predict the best representative structure for the underlying ensemble of possible conformations. Their parameters have therefore been trained on larger subsets of well-curated, known secondary structures. Physics-based methods, on the other hand, usually refrain from using optimized parameters. They model secondary structures from loops as individual building blocks which have been assigned a physical property instead: the free energy of the respective loop. Such free energies are either derived from experiments or from mathematical modeling. This rigorous use of physical properties then allows for the application of statistical mechanics to describe the entire state space of RNA secondary structures in terms of equilibrium probabilities. On that basis, and by using efficient algorithms, many more descriptors of the conformational state space of RNA molecules can be derived to investigate and explain the many functions of RNA molecules. Moreover, compared to other methods, physics-based models allow for a much easier extension with other properties that can be measured experimentally. For instance, small molecules or proteins can bind to an RNA and their binding affinity can be assessed experimentally. Under certain conditions, existing RNA secondary structure prediction tools can be used to model this RNA-ligand binding and to eventually shed light on its impact on structure formation and function.
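
The step from loop free energies to equilibrium probabilities mentioned in this abstract follows the Boltzmann distribution, p(s) = exp(-E(s)/RT) / Z. The sketch below illustrates that relationship for a small, explicitly enumerated set of structures. The function name and the example energies are assumptions made for illustration; real tools compute the partition function Z by dynamic programming rather than by enumeration.

```python
import math

GAS_CONSTANT = 1.98720425864083e-3  # kcal / (mol * K)


def boltzmann_probabilities(free_energies, temperature_c=37.0):
    """Equilibrium probabilities for structures with the given free energies (kcal/mol)."""
    rt = GAS_CONSTANT * (temperature_c + 273.15)
    # Shift by the minimum energy for numerical stability; the shift cancels in the ratio.
    e_min = min(free_energies)
    weights = [math.exp(-(e - e_min) / rt) for e in free_energies]
    partition_function = sum(weights)
    return [w / partition_function for w in weights]


# Hypothetical free energies of three competing structures, in kcal/mol.
example_energies = [-5.0, -4.0, 0.0]
probs = boltzmann_probabilities(example_energies)
for energy, p in zip(example_energies, probs):
    print(f"E = {energy:+.1f} kcal/mol -> p = {p:.4f}")
```

Lower free energy gives a higher probability, and a difference of 1 kcal/mol at 37 °C already changes the relative weight of two structures by roughly a factor of five.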


Subject(s)
Nucleic Acid Conformation , RNA , Thermodynamics , RNA/chemistry , Algorithms , Computational Biology/methods , Machine Learning , Models, Molecular
11.
Methods Mol Biol ; 2726: 125-141, 2024.
Article in English | MEDLINE | ID: mdl-38780730

ABSTRACT

Analysis of the folding space of RNA generally suffers from its exponential size. With classified Dynamic Programming algorithms, it is possible to alleviate this burden and to analyse the folding space of RNA in great depth. Key to classified DP is that the search space is partitioned into classes based on an on-the-fly computed feature. A class-wise evaluation is then used to compute class-wide properties, such as the lowest free energy structure for each class, or aggregate properties, such as the class' probability. In this paper we describe the well-known shape and hishape abstraction of RNA structures, their power to help better understand RNA function and related methods that are based on these abstractions.


Subject(s)
Algorithms , Computational Biology , Nucleic Acid Conformation , RNA Folding , RNA , RNA/chemistry , RNA/genetics , Computational Biology/methods , Software , Thermodynamics
12.
Methods Mol Biol ; 2726: 143-168, 2024.
Article in English | MEDLINE | ID: mdl-38780731

ABSTRACT

The 3D structures of many ribonucleic acid (RNA) loops are characterized by highly organized networks of non-canonical interactions. Multiple computational methods have been developed to annotate structures with those interactions or automatically identify recurrent interaction networks. By contrast, the reverse problem that aims to retrieve the geometry of a loop from its sequence or ensemble of interactions remains much less explored. In this chapter, we will describe how to retrieve and build families of conserved structural motifs using their underlying network of non-canonical interactions. Then, we will show how to assign sequence alignments to those families and use the software BayesPairing to build statistical models of structural motifs with their associated sequence alignments. From this model, we will apply BayesPairing to identify in new sequences regions where those loop geometries can occur.


Subject(s)
Base Pairing , Computational Biology , RNA , Software , Computational Biology/methods , RNA/chemistry , RNA/genetics , Nucleic Acid Conformation , Sequence Alignment/methods , Algorithms , Nucleotide Motifs , Bayes Theorem , Models, Molecular
13.
Methods Mol Biol ; 2726: 235-254, 2024.
Article in English | MEDLINE | ID: mdl-38780734

ABSTRACT

Generating accurate alignments of non-coding RNA sequences is indispensable in the quest for understanding RNA function. Nevertheless, aligning RNAs remains a challenging computational task. In the twilight-zone of RNA sequences with low sequence similarity, sequence homologies and compatible, favorable (a priori unknown) structures can be inferred only in dependency of each other. Thus, simultaneous alignment and folding (SA&F) remains the gold-standard of comparative RNA analysis, even if this method is computationally highly demanding. This text introduces the recent release 2.0 of the software package LocARNA, focusing on its practical application. The package enables versatile, fast and accurate analysis of multiple RNAs. For this purpose, it implements SA&F algorithms in a specific, lightweight flavor that makes them routinely applicable at large scale. Its high performance is achieved by combining ensemble-based sparsification of the structure space and banding strategies. Probabilistic banding strongly improves the performance of LocARNA 2.0 even over previous releases, while simplifying its effective use. Enabling flexible application to various use cases, LocARNA provides tools to globally and locally compare, cluster, and multiply align RNAs based on optimization and probabilistic variants of SA&F, which optionally integrate prior knowledge, expressible by anchor and structure constraints.


Subject(s)
Algorithms , Computational Biology , RNA Folding , RNA , Software , RNA/genetics , RNA/chemistry , Computational Biology/methods , Nucleic Acid Conformation , Sequence Alignment/methods , Sequence Analysis, RNA/methods
14.
Methods Mol Biol ; 2726: 85-104, 2024.
Article in English | MEDLINE | ID: mdl-38780728

ABSTRACT

The structures of RNA molecules and their complexes are crucial for understanding biology at the molecular level. Resolving these structures holds the key to understanding their manifold structure-mediated functions, ranging from regulating gene expression to catalyzing biochemical processes. Predicting RNA secondary structure is a prerequisite and a key step to accurately model their three-dimensional structure. Although dedicated modelling software is making fast and significant progress, predicting an accurate secondary structure from the sequence remains a challenge. Its performance can be significantly improved by the incorporation of experimental RNA structure probing data. Many different chemical and enzymatic probes have been developed; however, only one set of quantitative data can be incorporated as constraints for computer-assisted modelling. IPANEMAP is a recent workflow based on RNAfold that can take into account several quantitative or qualitative data sets to model RNA secondary structure. This chapter details the methods for popular chemical probing (DMS, CMCT, SHAPE-CE, and SHAPE-Map) and the subsequent analysis and structure prediction using IPANEMAP.


Subject(s)
Models, Molecular , Nucleic Acid Conformation , RNA , Software , Workflow , RNA/chemistry , RNA/genetics , Computational Biology/methods
15.
Methods Mol Biol ; 2726: 169-207, 2024.
Article in English | MEDLINE | ID: mdl-38780732

ABSTRACT

Nucleotide modifications occur in all types of RNA and play an important role in RNA structure formation and stability. Modified bases can not only shift the RNA structure ensemble towards desired functional conformations; by changing base-pairing partner preferences, they may even enlarge or reduce the conformational space, i.e., the number and types of structures the RNA molecule can adopt. However, most methods to predict RNA secondary structure do not provide the means to include the effect of modifications on the result. With the help of a heavily modified transfer RNA (tRNA) molecule, this chapter demonstrates how to include the effect of different base modifications into secondary structure prediction using the ViennaRNA Package. The constructive approach demonstrated here allows for the calculation of minimum free energy structure and suboptimal structures at different levels of modified base support. In particular, we show how to incorporate the isomerization of uridine to pseudouridine (Ψ) and the reduction of uridine to dihydrouridine (D).


Subject(s)
Nucleic Acid Conformation , RNA , RNA/chemistry , RNA, Transfer/chemistry , RNA, Transfer/metabolism , Nucleotides/chemistry , Base Pairing , Computational Biology/methods , Thermodynamics , Software , Uridine/chemistry , Models, Molecular , Pseudouridine/chemistry
16.
Methods Mol Biol ; 2726: 105-124, 2024.
Article in English | MEDLINE | ID: mdl-38780729

ABSTRACT

The structure of an RNA sequence encodes information about its biological function. Dynamic programming algorithms are often used to predict the conformation of an RNA molecule from its sequence alone, and adding experimental data as auxiliary information improves prediction accuracy. This auxiliary data is typically incorporated into the nearest neighbor thermodynamic model by converting the data into pseudoenergies. Here, we look at how much of the space of possible structures auxiliary data allows prediction methods to explore. We find that for a large class of RNA sequences, auxiliary data shifts the predictions significantly. Additionally, we find that predictions are highly sensitive to the parameters which define the auxiliary data pseudoenergies. In fact, the parameter space can typically be partitioned into regions where different structural predictions predominate.
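
To make the notion of a pseudoenergy concrete, the sketch below uses a log-linear conversion of the form ΔG(i) = m · ln(reactivity(i) + 1) + b, a form commonly used for SHAPE reactivities. This is an illustration, not necessarily the conversion studied in the chapter; the function name, the default slope m and intercept b, and the example reactivities are all assumptions chosen for demonstration.

```python
import math


def shape_pseudoenergy(reactivity, slope=2.6, intercept=-0.8):
    """Per-nucleotide pseudoenergy (kcal/mol) from a probing reactivity.

    Missing values (None) and negative reactivities are treated as
    "no data" and contribute no pseudoenergy. The default slope and
    intercept are illustrative assumptions.
    """
    if reactivity is None or reactivity < 0:
        return 0.0
    return slope * math.log(reactivity + 1.0) + intercept


# Hypothetical reactivity profile; None marks a position without data.
reactivities = [0.0, 0.1, 0.5, 2.0, None]

# The same data under two parameter choices gives different pseudoenergies.
for slope, intercept in [(2.6, -0.8), (1.8, -0.6)]:
    energies = [shape_pseudoenergy(r, slope, intercept) for r in reactivities]
    print(slope, intercept, [round(e, 2) for e in energies])
```

Low reactivities yield a negative term that rewards pairing at that position, while high reactivities yield a positive penalty; changing the slope and intercept shifts where that crossover occurs, which is one way to see why predictions can be sensitive to these parameters.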


Subject(s)
Algorithms , Computational Biology , Nucleic Acid Conformation , RNA , Thermodynamics , RNA/chemistry , RNA/genetics , Computational Biology/methods , Software
17.
Methods Mol Biol ; 2726: 209-234, 2024.
Article in English | MEDLINE | ID: mdl-38780733

ABSTRACT

Computational prediction of RNA-RNA interactions (RRI) is a central methodology for the specific investigation of inter-molecular RNA interactions and regulatory effects of non-coding RNAs like eukaryotic microRNAs or prokaryotic small RNAs. Available methods can be classified according to their underlying prediction strategies, each implicating specific capabilities and restrictions often not transparent to the non-expert user. Within this work, we review seven classes of RRI prediction strategies and discuss the advantages and limitations of respective tools, since such knowledge is essential for selecting the right tool in the first place. Among the RRI prediction strategies, accessibility-based approaches have been shown to provide the most reliable predictions. Here, we describe how IntaRNA, as one of the state-of-the-art accessibility-based tools, can be applied in various use cases for the task of computational RRI prediction. Detailed hands-on examples for individual RRI predictions as well as large-scale target prediction scenarios are provided. We illustrate the flexibility and capabilities of IntaRNA through the examples. Each example is designed using real-life data from the literature and is accompanied by instructions on interpreting the respective results from IntaRNA output. Our use-case driven instructions enable non-expert users to comprehensively understand and utilize IntaRNA's features for effective RRI predictions.


Subject(s)
Computational Biology , Software , Computational Biology/methods , RNA/genetics , RNA/metabolism , Algorithms , Humans , MicroRNAs/genetics , MicroRNAs/metabolism
18.
Methods Mol Biol ; 2726: 255-284, 2024.
Article in English | MEDLINE | ID: mdl-38780735

ABSTRACT

Effective homology search for non-coding RNAs is frequently not possible via sequence similarity alone. Current methods leverage evolutionary information like structure conservation or covariance scores to identify homologs in organisms that are phylogenetically more distant. In this chapter, we introduce the theoretical background of evolutionary structure conservation and covariance score, and we show hands-on how current methods in the field are applied on example datasets.


Subject(s)
Computational Biology , Evolution, Molecular , Computational Biology/methods , Phylogeny , Algorithms , RNA, Untranslated/genetics , Conserved Sequence , Humans , Animals , Software , Sequence Alignment/methods
19.
Methods Mol Biol ; 2726: 285-313, 2024.
Article in English | MEDLINE | ID: mdl-38780736

ABSTRACT

Applications in biotechnology and bio-medical research call for effective strategies to design novel RNAs with very specific properties. Such advanced design tasks require support by computational tools but at the same time put high demands on their flexibility and expressivity to model the application-specific requirements. To address such demands, we present the computational framework Infrared. It supports developing advanced customized design tools, which generate RNA sequences with specific properties, often in a few lines of Python code. This text guides the reader in tutorial format through the development of complex design applications. Thanks to the declarative, compositional approach of Infrared, we can describe this development as a step-by-step extension of an elementary design task. Thus, we start with generating sequences that are compatible with a single RNA structure and go all the way to RNA design targeting complex positive and negative design objectives with respect to single or even multiple target structures. Finally, we present a "real-world" application of computational design to create an RNA device for biotechnology: we use Infrared to generate design candidates of an artificial "AND" riboswitch, which activates gene expression in the simultaneous presence of two different small metabolites. In these applications, we exploit the fact that the system can generate, in an efficient (fixed-parameter tractable) way, multiple diverse designs that satisfy a number of constraints and have high quality with respect to an objective (by sampling from a Boltzmann distribution).
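
The elementary design task named in this abstract, generating sequences compatible with a single RNA structure, can be sketched in plain Python. The code below does not use the Infrared API; all function names are hypothetical. It also samples uniformly rather than from a Boltzmann distribution, so it illustrates only the compatibility constraint, assuming a dot-bracket structure and allowing Watson-Crick and G-U wobble pairs.

```python
import random

# Ordered nucleotide pairs allowed at paired positions (Watson-Crick plus G-U wobble).
PAIRS = [("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")]


def parse_dot_bracket(structure):
    """Return a dict mapping each opening position to its closing partner."""
    stack, pairs = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unexpected ')'")
            pairs[stack.pop()] = i
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: unclosed '('")
    return pairs


def sample_compatible_sequence(structure, rng=random):
    """Sample one sequence whose paired positions can form the given structure."""
    sequence = [None] * len(structure)
    for i, j in parse_dot_bracket(structure).items():
        sequence[i], sequence[j] = rng.choice(PAIRS)
    for k, nucleotide in enumerate(sequence):
        if nucleotide is None:  # unpaired position: any nucleotide is compatible
            sequence[k] = rng.choice("ACGU")
    return "".join(sequence)


target = "((((...)))).."
print(target)
for _ in range(3):
    print(sample_compatible_sequence(target))
```

Compatibility only guarantees that the target structure can form; it does not make that structure energetically favorable, which is why design frameworks add objectives and constraints on top of this step.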


Subject(s)
Computational Biology , Nucleic Acid Conformation , RNA , Software , RNA/genetics , RNA/chemistry , Computational Biology/methods , Riboswitch/genetics , Biotechnology/methods
20.
Methods Mol Biol ; 2726: 315-346, 2024.
Article in English | MEDLINE | ID: mdl-38780737

ABSTRACT

Although RNA molecules are synthesized via transcription, little is known about the general impact of cotranscriptional folding in vivo. We present different computational approaches for the simulation of changing structure ensembles during transcription, including interpretations with respect to experimental data from literature. Specifically, we analyze different mutations of the E. coli SRP RNA, which has been studied comparatively well in previous literature, yet the details of which specific metastable structures form as well as when they form are still under debate. Here, we combine thermodynamic and kinetic, deterministic, and stochastic models with automated and visual inspection of those systems to derive the most likely scenario of which substructures form at which point during transcription. The simulations do not only provide explanations for present experimental observations but also suggest previously unnoticed conformations that may be verified through future experimental studies.


Subject(s)
Escherichia coli , Nucleic Acid Conformation , RNA Folding , RNA, Bacterial , Thermodynamics , Transcription, Genetic , RNA, Bacterial/chemistry , RNA, Bacterial/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Signal Recognition Particle/chemistry , Signal Recognition Particle/metabolism , Signal Recognition Particle/genetics , Kinetics , Computational Biology/methods , Mutation , Models, Molecular