Results 1 - 20 of 6,346
1.
Curr Protoc ; 4(5): e1054, 2024 May.
Article in English | MEDLINE | ID: mdl-38808970

ABSTRACT

RNA sequencing (RNA-seq) has emerged as a powerful tool for assessing genome-wide gene expression, revolutionizing various fields of biology. However, analyzing large RNA-seq datasets can be challenging, especially for students or researchers lacking bioinformatics experience. To address these challenges, we present a comprehensive guide that provides step-by-step workflows for analyzing RNA-seq data, from raw reads to functional enrichment analysis, starting with considerations for experimental design. This guide is designed to aid students and researchers working with any organism, irrespective of whether an assembled genome is available. Within this guide, we employ various recognized bioinformatics tools to navigate the landscape of RNA-seq analysis and discuss the advantages and disadvantages of different tools for the same task. Our protocol focuses on clarity, reproducibility, and practicality to enable users to navigate the complexities of RNA-seq data analysis easily and gain valuable biological insights from the datasets. Additionally, all scripts and a sample dataset are available in a GitHub repository to facilitate the implementation of the analysis pipeline. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Analysis of data from a model plant with an available reference genome. Basic Protocol 2: Gene ontology enrichment analysis. Basic Protocol 3: De novo assembly of data from non-model plants.


Subject(s)
RNA-Seq , RNA-Seq/methods , Computational Biology/methods , Sequence Analysis, RNA/methods , Software
2.
Theor Appl Genet ; 137(6): 143, 2024 May 27.
Article in English | MEDLINE | ID: mdl-38801535

ABSTRACT

KEY MESSAGE: Association analysis, colocation study with previously reported QTL, and differential expression analyses allowed the identification of the consistent QTLs and main candidate genes controlling seed traits. Common beans show wide seed variations in shape, size, water uptake, and coat proportion. This study aimed to identify consistent genomic regions and candidate genes involved in the genetic control of seed traits by combining association and differential expression analyses. In total, 298 lines from the Spanish Diversity Panel were genotyped with 4,658 SNP and phenotyped for seven seed traits in three seasons. Thirty-eight significant SNP-trait associations were detected, which were grouped into 23 QTL genomic regions with 1,605 predicted genes. The positions of the five QTL regions associated with seed weight were consistent with previously reported QTL. HCPC analysis using the SNP that tagged these five QTL regions revealed three main clusters with significantly different seed weights. This analysis also separated groups that corresponded well with the two gene pools described: Andean and Mesoamerican. Expression analysis was performed on the seeds of the cultivar 'Xana' in three seed development stages, and 1,992 differentially expressed genes (DEGs) were detected, mainly when comparing the early and late seed development stages (1,934 DEGs). Overall, 91 DEGs related to cell growth, signaling pathways, and transcriptomic factors underlying these 23 QTL were identified. Twenty-two DEGs were located in the five QTL regions associated with seed weight, suggesting that they are the main set of candidate genes controlling this character. The results confirmed that seed weight is the sum of the effects of a complex network of loci, and contributed to the understanding of seed phenotype control.


Subject(s)
Phaseolus , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Seeds , Seeds/genetics , Seeds/growth & development , Phaseolus/genetics , Phaseolus/growth & development , Genotype , RNA-Seq , Genetic Association Studies , Genes, Plant , Chromosome Mapping , Gene Expression Regulation, Plant , Genome-Wide Association Study
3.
Allergy ; 79(6): 1584-1597, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38817208

ABSTRACT

BACKGROUND: Efforts to profile atopic dermatitis (AD) tissues have intensified, yet comprehensive analysis of systemic immune landscapes in severe AD remains crucial. METHODS: Employing single-cell RNA sequencing, we analyzed over 300,000 peripheral blood mononuclear cells from 12 severe AD patients (Eczema area and severity index (EASI) > 21) and six healthy controls. RESULTS: Results revealed significant immune cell shifts in AD patients, including increased Th2 cell abundance, reduced NK cell clusters with compromised cytotoxicity, and correlated Type 2 innate lymphoid cell proportions with disease severity. Moreover, unique monocyte clusters reflecting activated innate immunity emerged in very severe AD (EASI > 30). While overall dendritic cells (DCs) counts decreased, a distinct Th2-priming subset termed "Th2_DC" correlated strongly with disease severity, validated across skin tissue data, and flow cytometry with additional independent severe AD samples. Beyond the recognized role of Th2 adaptive immunity, our findings highlight significant innate immune cell alterations in severe AD, implicating their roles in disease pathogenesis and therapeutic potentials. CONCLUSION: Apart from the widely recognized role of Th2 adaptive immunity in AD pathogenesis, alterations in innate immune cells and impaired cytotoxic cells have also been observed in severe AD. The impact of these alterations on disease pathogenesis and the effectiveness of potential therapeutic targets requires further investigation.


Subject(s)
Dermatitis, Atopic , RNA-Seq , Severity of Illness Index , Single-Cell Analysis , Dermatitis, Atopic/immunology , Humans , Immunity, Innate , Male , Th2 Cells/immunology , Th2 Cells/metabolism , Female , Adult , Dendritic Cells/immunology , Dendritic Cells/metabolism , Leukocytes, Mononuclear/immunology , Leukocytes, Mononuclear/metabolism , Killer Cells, Natural/immunology , Killer Cells, Natural/metabolism , Case-Control Studies , Single-Cell Gene Expression Analysis
4.
Front Immunol ; 15: 1397541, 2024.
Article in English | MEDLINE | ID: mdl-38774870

ABSTRACT

Aim: Despite the significant therapeutic outcomes achieved in systemic treatments for liver hepatocellular carcinoma (LIHC), it is an objective reality that only a low proportion of patients exhibit an improved objective response rate (ORR) to current immunotherapies. Antibody-dependent cellular phagocytosis (ADCP) immunotherapy is considered the new engine for precision immunotherapy. Based on this, we aim to develop an ADCP-based LIHC risk stratification system and screen for relevant targets. Method: Utilizing a combination of single-cell RNA sequencing (scRNA-seq) and bulk RNA-seq data, we screened for ADCP modulating factors in LIHC and identified differentially expressed genes along with their involved functional pathways. A risk scoring model was established by identifying ADCP-related genes with prognostic value through LASSO Cox regression analysis. The risk scoring model was then subjected to evaluations of immune infiltration and immunotherapy relevance, with pan-cancer analysis and in vitro experimental studies conducted on key targets. Results: Building on the research by Kamber RA et al., we identified GYPA, CLDN18, and IRX5 as potential key target genes regulating ADCP in LIHC. These genes demonstrated significant correlations with immune infiltration cells, such as M1-type macrophages, and the effectiveness of immunotherapy in LIHC, as well as a close association with clinical pathological staging and patient prognosis. Pan-cancer analysis revealed that CLDN18 was prognostically and immunologically relevant across multiple types of cancer. Validation through tissue and cell samples confirmed that GYPA and CLDN18 were upregulated in liver cancer tissues and cells. Furthermore, in vitro knockdown of CLDN18 inhibited the malignancy capabilities of liver cancer cells. Conclusion: We have identified an ADCP signature in LIHC comprising three genes. Analysis based on a risk scoring model derived from these three genes, coupled with subsequent experimental validation, confirmed the pivotal role of M1-type macrophages in ADCP within LIHC, establishing CLDN18 as a critical ADCP regulatory target in LIHC.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , RNA-Seq , Humans , Liver Neoplasms/genetics , Liver Neoplasms/immunology , Liver Neoplasms/therapy , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/immunology , Carcinoma, Hepatocellular/therapy , Prognosis , Immunotherapy/methods , Gene Expression Regulation, Neoplastic , Biomarkers, Tumor/genetics , Single-Cell Analysis , Phagocytosis/genetics , Tumor Microenvironment/immunology , Tumor Microenvironment/genetics , Gene Expression Profiling , Male , Claudins/genetics , Female , Single-Cell Gene Expression Analysis
5.
Exp Dermatol ; 33(5): e15077, 2024 May.
Article in English | MEDLINE | ID: mdl-38711200

ABSTRACT

Modelling atopic dermatitis (AD) in vitro is paramount to understand the disease pathophysiology and identify novel treatments. Previous studies have shown that the Th2 cytokines IL-4 and IL-13 induce AD-like features in keratinocytes in vitro. However, it has not been systematically researched whether the addition of Th2 cells, their supernatants or a 3D structure is superior to model AD compared to simple 2D cell culture with cytokines. For the first time, we investigated what in vitro option most closely resembles the disease in vivo based on single-cell RNA sequencing data (scRNA-seq) obtained from skin biopsies in a clinical study and published datasets of healthy and AD donors. In vitro models were generated with primary fibroblasts and keratinocytes, subjected to cytokine treatment or Th2 cell cocultures in 2D/3D. Gene expression changes were assessed using qPCR and Multiplex Immunoassays. Of all cytokines tested, incubation of keratinocytes and fibroblasts with IL-4 and IL-13 induced the closest in vivo-like AD phenotype which was observed in the scRNA-seq data. Addition of Th2 cells to fibroblasts failed to model AD due to the downregulation of ECM-associated genes such as POSTN. While keratinocytes cultured in 3D showed better stratification than in 2D, changes induced with AD triggers did not better resemble AD keratinocyte subtypes observed in vivo. Taken together, our comprehensive study shows that the simple model using IL-4 or IL-13 in 2D most accurately models AD in fibroblasts and keratinocytes in vitro, which may aid the discovery of novel treatment options.


Subject(s)
Dermatitis, Atopic , Fibroblasts , Interleukin-13 , Interleukin-4 , Keratinocytes , Sequence Analysis, RNA , Single-Cell Analysis , Th2 Cells , Humans , Fibroblasts/metabolism , Interleukin-4/pharmacology , Interleukin-4/metabolism , Interleukin-13/metabolism , Interleukin-13/pharmacology , Cytokines/metabolism , Coculture Techniques , RNA-Seq , Cells, Cultured , Skin/pathology
6.
BMC Cancer ; 24(1): 607, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38769480

ABSTRACT

BACKGROUND: Cancerous cells' identity is determined via a mixture of multiple factors such as genomic variations, epigenetics, and the regulatory variations that are involved in transcription. The differences in transcriptome expression as well as abnormal structures in peptides determine phenotypical differences. Thus, bulk RNA-seq and more recent single-cell RNA-seq data (scRNA-seq) are important to identify pathogenic differences. In this case, we rely on k-mer decomposition of sequences to identify pathogenic variations in detail which does not need a reference, so it outperforms more traditional Next-Generation Sequencing (NGS) analysis techniques depending on the alignment of the sequences to a reference. RESULTS: Via our alignment-free analysis, over esophageal and glioblastoma cancer patients, high-frequency variations over multiple different locations (repeats, intergenic regions, exons, introns) as well as multiple different forms (fusion, polyadenylation, splicing, etc.) could be discovered. Additionally, we have analyzed the importance of less-focused events systematically in a classic transcriptome analysis pipeline where these events are considered as indicators for tumor prognosis, tumor prediction, tumor neoantigen inference, as well as their connection with respect to the immune microenvironment. CONCLUSIONS: Our results suggest that esophageal cancer (ESCA) and glioblastoma processes can be explained via pathogenic microbial RNA, repeated sequences, novel splicing variants, and long intergenic non-coding RNAs (lincRNAs). We expect our application of reference-free process and analysis to be helpful in tumor and normal samples differential scRNA-seq analysis, which in turn offers a more comprehensive scheme for major cancer-associated events.
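The reference-free idea in this abstract, decomposing reads into k-mers and comparing k-mer content between samples instead of aligning to a genome, can be illustrated with a minimal sketch. The function names (`kmer_counts`, `sample_specific_kmers`) and the `min_count` threshold are assumptions made for illustration; they are not taken from the authors' pipeline, which is not described at this level of detail here.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in one read, skipping k-mers with ambiguous bases."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:
            counts[kmer] += 1
    return counts


def sample_specific_kmers(tumor_reads, normal_reads, k, min_count=2):
    """Return k-mers seen at least `min_count` times in tumor reads and never in normal reads.

    No reference genome is used: the comparison is purely between the two read sets.
    """
    tumor = Counter()
    for read in tumor_reads:
        tumor.update(kmer_counts(read, k))
    normal = Counter()
    for read in normal_reads:
        normal.update(kmer_counts(read, k))
    return {kmer: n for kmer, n in tumor.items()
            if n >= min_count and kmer not in normal}


if __name__ == "__main__":
    # Toy reads, invented for the example.
    print(sample_specific_kmers(["ACGTT", "ACGTT"], ["ACGAA"], k=3))
```

Real tools in this space typically use much larger k, canonical k-mers (a k-mer and its reverse complement treated as one), and disk-backed counting; the sketch only shows the core comparison.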


Subject(s)
Glioblastoma , Single-Cell Analysis , Transcriptome , Humans , Single-Cell Analysis/methods , Glioblastoma/genetics , Glioblastoma/pathology , Gene Expression Profiling/methods , Esophageal Neoplasms/genetics , Esophageal Neoplasms/pathology , High-Throughput Nucleotide Sequencing , RNA-Seq/methods , Sequence Analysis, RNA/methods , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , Neoplasms/pathology
7.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38770716

ABSTRACT

Temporal RNA-sequencing (RNA-seq) studies of bulk samples provide an opportunity for improved understanding of gene regulation during dynamic phenomena such as development, tumor progression or response to an incremental dose of a pharmacotherapeutic. Moreover, single-cell RNA-seq (scRNA-seq) data implicitly exhibit temporal characteristics because gene expression values recapitulate dynamic processes such as cellular transitions. Unfortunately, temporal RNA-seq data continue to be analyzed by methods that ignore this ordinal structure and yield results that are often difficult to interpret. Here, we present Error Modelled Gene Expression Analysis (EMOGEA), a framework for analyzing RNA-seq data that incorporates measurement uncertainty, while introducing a special formulation for those acquired to monitor dynamic phenomena. This method is specifically suited for RNA-seq studies in which low-count transcripts with small-fold changes lead to significant biological effects. Such transcripts include genes involved in signaling and non-coding RNAs that inherently exhibit low levels of expression. Using simulation studies, we show that this framework down-weights samples that exhibit extreme responses such as batch effects allowing them to be modeled with the rest of the samples and maintain the degrees of freedom originally envisioned for a study. Using temporal experimental data, we demonstrate the framework by extracting a cascade of gene expression waves from a well-designed RNA-seq study of zebrafish embryogenesis and an scRNA-seq study of mouse pre-implantation and provide unique biological insights into the regulation of genes in each wave. For non-ordinal measurements, we show that EMOGEA has a much higher rate of true positive calls and a vanishingly small rate of false negative discoveries compared to common approaches. Finally, we provide two packages in Python and R that are self-contained and easy to use, including test data.


Subject(s)
RNA-Seq , Zebrafish , Animals , Zebrafish/genetics , RNA-Seq/methods , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Mice , Sequence Analysis, RNA/methods , Software
8.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38770720

ABSTRACT

The normalization of RNA sequencing data is a primary step for downstream analysis. The most popular methods used for normalization are the trimmed mean of M values (TMM) and DESeq. TMM trims away extreme log fold changes in the data and normalizes the raw read counts based on the remaining non-differentially expressed genes. However, the major problem with TMM is that the value of the trimming factor for the M values is chosen heuristically. This paper estimates an adaptive value of M in TMM based on Jaeckel's Estimator, and each sample acts as a reference to find the scale factor of each sample. The presented approach is validated on SEQC, MAQC2, MAQC3, PICKRELL and two simulated datasets with two-group and three-group conditions by varying the percentage of differential expression and the number of replicates. The performance of the present approach is compared with various state-of-the-art methods, and it is better in terms of area under the receiver operating characteristic curve and differential expression.
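To make the trimming step concrete, the following is a simplified sketch of a TMM-style scale factor with a fixed, heuristic trim fraction, which is the setting the abstract criticizes. It is not the adaptive Jaeckel-based estimator proposed in the paper, and it omits the A-value trimming and precision weights of the full TMM method. The function name `tmm_factor` and the parameter `trim_m` are assumptions for illustration.

```python
import math


def tmm_factor(sample, reference, trim_m=0.3):
    """Simplified TMM-style scale factor of `sample` relative to `reference`.

    M values are per-gene log2 ratios of library-size-normalized counts.
    The lowest and highest `trim_m` fraction of M values are discarded and
    the unweighted mean of the rest is converted back from log2 scale.
    """
    n_s, n_r = sum(sample), sum(reference)
    m_values = sorted(
        math.log2((s / n_s) / (r / n_r))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0  # genes with a zero count have no finite log ratio
    )
    if not m_values:
        raise ValueError("no gene has non-zero counts in both samples")
    cut = int(len(m_values) * trim_m)
    kept = m_values[cut:len(m_values) - cut] or m_values
    return 2 ** (sum(kept) / len(kept))


if __name__ == "__main__":
    reference = [10] * 10
    sample = [10] * 9 + [1000]  # one strongly up-regulated gene inflates the library size
    factor = tmm_factor(sample, reference)
    print(factor, sum(sample) * factor)  # effective library size is close to 100
```

In the toy example, a single highly expressed gene inflates the raw library size from 100 to 1090; because its M value is trimmed, the effective library size (raw size times factor) returns to roughly 100, matching the reference.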


Subject(s)
RNA-Seq , RNA-Seq/methods , Humans , Algorithms , Sequence Analysis, RNA/methods , Computational Biology/methods , Gene Expression Profiling/methods , ROC Curve , Software
9.
Nat Commun ; 15(1): 4055, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744843

ABSTRACT

We introduce GRouNdGAN, a gene regulatory network (GRN)-guided reference-based causal implicit generative model for simulating single-cell RNA-seq data, in silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on six experimental reference datasets, we show that our model captures non-linear TF-gene dependencies and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. GRouNdGAN can synthesize cells under new conditions to perform in silico TF knockout experiments. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest.


Subject(s)
Algorithms , Computer Simulation , Gene Regulatory Networks , RNA-Seq , Single-Cell Analysis , Single-Cell Analysis/methods , RNA-Seq/methods , Humans , Transcription Factors/metabolism , Transcription Factors/genetics , Computational Biology/methods , Benchmarking , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis
10.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38739758

ABSTRACT

The complicated process of neuronal development is initiated early in life, with the genetic mechanisms governing this process yet to be fully elucidated. Single-cell RNA sequencing (scRNA-seq) is a potent instrument for pinpointing biomarkers that exhibit differential expression across various cell types and developmental stages. By employing scRNA-seq on human embryonic stem cells, we aim to identify differentially expressed genes (DEGs) crucial for early-stage neuronal development. Our focus extends beyond simply identifying DEGs. We strive to investigate the functional roles of these genes through enrichment analysis and construct gene regulatory networks to understand their interactions. Ultimately, this comprehensive approach aspires to illuminate the molecular mechanisms and transcriptional dynamics governing early human brain development. By uncovering potential links between these DEGs and intelligence, mental disorders, and neurodevelopmental disorders, we hope to shed light on human neurological health and disease. In this study, we have used scRNA-seq to identify DEGs involved in early-stage neuronal development in hESCs. The scRNA-seq data, collected on days 26 (D26) and 54 (D54), of the in vitro differentiation of hESCs to neurons were analyzed. Our analysis identified 539 DEGs between D26 and D54. Functional enrichment of those DEG biomarkers indicated that the up-regulated DEGs participated in neurogenesis, while the down-regulated DEGs were linked to synapse regulation. The Reactome pathway analysis revealed that down-regulated DEGs were involved in the interactions between proteins located in synapse pathways. We also discovered interactions between DEGs and miRNA, transcriptional factors (TFs) and DEGs, and between TF and miRNA. Our study identified 20 significant transcription factors, shedding light on early brain development genetics. The identified DEGs and gene regulatory networks are valuable resources for future research into human brain development and neurodevelopmental disorders.


Subject(s)
Biomarkers , Brain , Gene Regulatory Networks , Human Embryonic Stem Cells , Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Human Embryonic Stem Cells/metabolism , Human Embryonic Stem Cells/cytology , Brain/metabolism , Brain/embryology , Brain/cytology , Biomarkers/metabolism , Neurons/metabolism , Neurons/cytology , Cell Differentiation/genetics , RNA-Seq , Neurogenesis/genetics , Gene Expression Regulation, Developmental , Gene Expression Profiling , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis
11.
Methods Mol Biol ; 2808: 121-127, 2024.
Article in English | MEDLINE | ID: mdl-38743366

ABSTRACT

During the infection of a host cell by an infectious agent, a series of gene expression changes occurs as a consequence of host-pathogen interactions. Unraveling this complex interplay is key to understanding microbial virulence and host response pathways, thus providing the basis for new molecular insights into the mechanisms of pathogenesis and the corresponding immune response. Dual RNA sequencing (dual RNA-seq) has been developed to simultaneously determine pathogen and host transcriptomes, enabling both differential and coexpression analyses between the two partners as well as genome characterization in the case of RNA viruses. Here, we provide a detailed laboratory protocol and bioinformatics analysis guidelines for dual RNA-seq experiments focusing on - but not restricted to - measles virus (MeV) as a pathogen of interest. The application of dual RNA-seq technologies in MeV-infected patients can potentially provide valuable information on the structure of the viral RNA genome and on cellular innate immune responses and drive the discovery of new targets for antiviral therapy.


Subject(s)
Genome, Viral , Host-Pathogen Interactions , Measles virus , Measles , RNA, Viral , Humans , Measles/virology , Measles/immunology , Measles/genetics , Measles virus/genetics , Measles virus/pathogenicity , RNA, Viral/genetics , Host-Pathogen Interactions/genetics , Host-Pathogen Interactions/immunology , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods , Transcriptome , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods
12.
Sci Rep ; 14(1): 10940, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38740888

ABSTRACT

Improving the baking quality is a primary challenge in the wheat flour production value chain, as baking quality represents a crucial factor in determining its overall value. In the present study, we conducted a comparative RNA-Seq analysis on the high baking quality mutant "O-64.1.10" genotype and its low baking quality wild type "Omid" cultivar to recognize potential genes associated with bread quality. The cDNA libraries were constructed from immature grains that were 15 days post-anthesis, with an average of 16.24 and 18.97 million paired-end short-read sequences in the mutant and wild-type, respectively. A total of 733 transcripts with differential expression were identified, 585 genes up-regulated and 188 genes down-regulated in the "O-64.1.10" genotype compared to the "Omid". In addition, the families of HSF, bZIP, C2C2-Dof, B3-ARF, BES1, C3H, GRF, HB-HD-ZIP, PLATZ, MADS-MIKC, GARP-G2-like, NAC, OFP and TUB appeared as the key transcription factors with specific expression in the "O-64.1.10" genotype. At the same time, pathways related to baking quality were identified through Kyoto Encyclopedia of Genes and Genomes. Collectively, we found that the endoplasmic network, metabolic pathways, secondary metabolite biosynthesis, hormone signaling pathway, B group vitamins, protein pathways, pathways associated with carbohydrate and fat metabolism, as well as the biosynthesis and metabolism of various amino acids, have a great deal of potential to play a significant role in the baking quality. Ultimately, the RNA-seq results were confirmed using quantitative Reverse Transcription PCR for some hub genes such as alpha-gliadin, low molecular weight glutenin subunit and terpene synthase (gibberellin). As a resource for future study, 127 EST-SSR primers were generated using RNA-seq data.


Subject(s)
Gene Expression Profiling , Gene Expression Regulation, Plant , RNA-Seq , Triticum , Triticum/genetics , Triticum/growth & development , Triticum/metabolism , RNA-Seq/methods , Gene Expression Profiling/methods , Transcriptome , Edible Grain/genetics , Edible Grain/metabolism , Cooking , Bread , Plant Proteins/genetics , Plant Proteins/metabolism , Genotype , Flour
13.
Sci Rep ; 14(1): 10873, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38740918

ABSTRACT

In addition to presenting significant diagnostic and treatment challenges, lung adenocarcinoma (LUAD) is the most common form of lung cancer. Using scRNA-Seq and bulk RNA-Seq data, we identify three genes, HMR, FAM83A, and KRT6A, that are related to necroptotic anoikis-related gene expression. Initial validation, conducted on the GSE50081 dataset, demonstrated the model's ability to categorize LUAD patients into high-risk and low-risk groups with significant survival differences. This model was further applied to predict responses to PD-1/PD-L1 blockade therapies, utilizing the IMvigor210 and GSE78220 cohorts, and showed strong correlation with patient outcomes, highlighting its potential in personalized immunotherapy. Further, LUAD cell lines were analyzed using quantitative PCR (qPCR) and Western blot analysis to confirm their expression levels, further corroborating the model's relevance in LUAD pathophysiology. The mutation landscape of these genes was also explored, revealing their broad implication in various cancer types through a pan-cancer analysis. The study also delved into molecular subclustering, revealing distinct expression profiles and associations with different survival outcomes, emphasizing the model's utility in precision oncology. Moreover, the diversity of immune cell infiltration, analyzed in relation to the necroptotic anoikis signature, suggested significant implications for immune evasion mechanisms in LUAD. While the findings present a promising stride towards personalized LUAD treatment, especially in immunotherapy, limitations such as the retrospective nature of the datasets and the need for larger sample sizes are acknowledged. Prospective clinical trials and further experimental research are essential to validate these findings and enhance the clinical applicability of our prognostic model.


Subject(s)
Adenocarcinoma of Lung , Anoikis , B7-H1 Antigen , Immunotherapy , Lung Neoplasms , Programmed Cell Death 1 Receptor , RNA-Seq , Humans , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/immunology , Adenocarcinoma of Lung/drug therapy , Adenocarcinoma of Lung/pathology , Adenocarcinoma of Lung/mortality , Anoikis/genetics , Lung Neoplasms/genetics , Lung Neoplasms/drug therapy , Lung Neoplasms/pathology , Lung Neoplasms/immunology , Lung Neoplasms/mortality , Prognosis , Immunotherapy/methods , Programmed Cell Death 1 Receptor/genetics , Programmed Cell Death 1 Receptor/antagonists & inhibitors , B7-H1 Antigen/genetics , B7-H1 Antigen/metabolism , Single-Cell Analysis , Gene Expression Regulation, Neoplastic , Cell Line, Tumor , Immune Checkpoint Inhibitors/therapeutic use , Immune Checkpoint Inhibitors/pharmacology , Biomarkers, Tumor/genetics
14.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38706317

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) enables the exploration of cellular heterogeneity by analyzing gene expression profiles in complex tissues. However, scRNA-seq data often suffer from technical noise, dropout events and sparsity, hindering downstream analyses. Although existing works attempt to mitigate these issues by utilizing graph structures for data denoising, they involve the risk of propagating noise and fall short of fully leveraging the inherent data relationships, relying mainly on one of cell-cell or gene-gene associations and graphs constructed by initial noisy data. To this end, this study presents single-cell bilevel feature propagation (scBFP), a two-step graph-based feature propagation method. It initially imputes zero values using non-zero values, ensuring that the imputation process does not affect the non-zero values due to dropout. Subsequently, it denoises the entire dataset by leveraging gene-gene and cell-cell relationships in the respective steps. Extensive experimental results on scRNA-seq data demonstrate the effectiveness of scBFP in various downstream tasks, uncovering valuable biological insights.
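The first step described above, filling zero entries from non-zero ones while leaving observed values untouched, can be sketched as a generic graph feature propagation. This is an illustration of the general technique only, not the scBFP implementation: the neighbor lists are assumed to be given (scBFP builds its own gene-gene and cell-cell graphs), and the name `propagate_zeros` and the parameter `n_iter` are hypothetical.

```python
def propagate_zeros(matrix, neighbors, n_iter=10):
    """Impute zero entries of a cells x genes matrix by neighbor averaging.

    matrix    : list of rows, one row of expression values per cell
    neighbors : neighbors[i] is the list of cell indices adjacent to cell i
    Observed (non-zero) entries are reset to their original value on every
    iteration, so imputation never alters them.
    """
    observed = [[value != 0 for value in row] for row in matrix]
    current = [[float(value) for value in row] for row in matrix]
    for _ in range(n_iter):
        updated = []
        for i, row in enumerate(current):
            new_row = []
            for g in range(len(row)):
                if observed[i][g] or not neighbors[i]:
                    new_row.append(float(matrix[i][g]))  # keep the measured value
                else:
                    total = sum(current[j][g] for j in neighbors[i])
                    new_row.append(total / len(neighbors[i]))
            updated.append(new_row)
        current = updated
    return current


if __name__ == "__main__":
    counts = [[4, 0], [0, 2], [2, 0]]        # toy data: 3 cells x 2 genes
    graph = [[1, 2], [0, 2], [0, 1]]         # fully connected toy cell graph
    for row in propagate_zeros(counts, graph):
        print(row)
```

In the toy example, the zero of cell 1 for gene 0 becomes the mean of its two observed neighbors, while the two zeros for gene 1 converge toward the single observed value. A second, separate smoothing pass over all entries would correspond to the denoising step the abstract mentions.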


Subject(s)
Sequence Analysis, RNA , Single-Cell Analysis , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , Humans , Algorithms , Gene Expression Profiling/methods , Computational Biology/methods , RNA-Seq/methods
15.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701412

ABSTRACT

Trajectory inference is a crucial task in single-cell RNA-sequencing downstream analysis, which can reveal the dynamic processes of biological development, including cell differentiation. Dimensionality reduction is an important step in the trajectory inference process. However, most existing trajectory methods rely on cell features derived from traditional dimensionality reduction methods, such as principal component analysis and uniform manifold approximation and projection. These methods are not specifically designed for trajectory inference and fail to fully leverage prior information from upstream analysis, limiting their performance. Here, we introduce scCRT, a novel dimensionality reduction model for trajectory inference. In order to utilize prior information to learn accurate cell representations, scCRT integrates two feature learning components: a cell-level pairwise module and a cluster-level contrastive module. The cell-level module focuses on learning accurate cell representations in a reduced-dimensionality space while maintaining the cell-cell positional relationships in the original space. The cluster-level contrastive module uses prior cell state information to aggregate similar cells, preventing excessive dispersion in the low-dimensional space. Experimental findings from 54 real and 81 synthetic datasets, totaling 135 datasets, highlighted the superior performance of scCRT compared with commonly used trajectory inference methods. Additionally, an ablation study revealed that both cell-level and cluster-level modules enhance the model's ability to learn accurate cell features, facilitating cell lineage inference. The source code of scCRT is available at https://github.com/yuchen21-web/scCRT-for-scRNA-seq.


Subject(s)
Algorithms , Single-Cell Analysis , Single-Cell Analysis/methods , Humans , RNA-Seq/methods , Computational Biology/methods , Software , Sequence Analysis, RNA/methods , Animals , Single-Cell Gene Expression Analysis
16.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701413

ABSTRACT

With the emergence of large amounts of single-cell RNA sequencing (scRNA-seq) data, the exploration of computational methods has become critical in revealing biological mechanisms. Clustering is a representative approach for deciphering cellular heterogeneity embedded in scRNA-seq data. However, due to the diversity of datasets, none of the existing single-cell clustering methods shows overwhelming performance on all datasets. Weighted ensemble methods are proposed to integrate multiple results to improve heterogeneity analysis performance. These methods are usually weighted by considering the reliability of the base clustering results, ignoring the performance difference of the same base clustering on different cells. In this paper, we propose scEWE, a self-representative ensemble learning framework based on a high-order element-wise weighting strategy. By assigning different base clustering weights to individual cells, we construct and optimize the consensus matrix in a careful and exquisite way. In addition, we extract the high-order information between cells, which enhances the ability to represent the similarity relationship between cells. scEWE is experimentally shown to significantly outperform the state-of-the-art methods, which strongly demonstrates the effectiveness of the method and supports the potential applications in complex single-cell data analytical problems.
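The contrast the abstract draws, one weight per base clustering versus a separate weight per cell within each base clustering, can be illustrated with a small consensus (co-association) matrix sketch. This shows the general element-wise weighting idea only; it does not reproduce scEWE's self-representative learning or its high-order information. The function `weighted_consensus` and the weight values are assumptions for illustration.

```python
def weighted_consensus(labelings, cell_weights):
    """Build a consensus matrix from several base clusterings with per-cell weights.

    labelings    : labelings[m][i] is the cluster label of cell i in base clustering m
    cell_weights : cell_weights[m][i] is how much clustering m is trusted for cell i
    Entry (i, j) is the weighted fraction of base clusterings that place cells
    i and j in the same cluster, where clustering m contributes with weight
    cell_weights[m][i] * cell_weights[m][j].
    """
    n_cells = len(labelings[0])
    consensus = [[0.0] * n_cells for _ in range(n_cells)]
    for i in range(n_cells):
        for j in range(n_cells):
            agree = 0.0
            total = 0.0
            for labels, weights in zip(labelings, cell_weights):
                w = weights[i] * weights[j]
                total += w
                if labels[i] == labels[j]:
                    agree += w
            consensus[i][j] = agree / total if total > 0 else 0.0
    return consensus


if __name__ == "__main__":
    base = [[0, 0, 1], [0, 1, 1]]                 # two toy clusterings of three cells
    uniform = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
    per_cell = [[1.0, 1.0, 1.0], [1.0, 0.0, 1.0]]  # clustering 2 distrusted for cell 1
    print(weighted_consensus(base, uniform)[0][1])
    print(weighted_consensus(base, per_cell)[0][1])
```

With uniform weights the two clusterings disagree about cells 0 and 1, giving a consensus of 0.5; down-weighting the second clustering for cell 1 alone resolves the pair in favor of the first clustering, without changing how that clustering is trusted for the other cells.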


Subject(s)
Sequence Analysis, RNA , Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , Sequence Analysis, RNA/methods , Algorithms , Computational Biology/methods , Humans , RNA-Seq/methods
17.
Science ; 384(6698): eadh1938, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781370

ABSTRACT

The molecular organization of the human neocortex historically has been studied in the context of its histological layers. However, emerging spatial transcriptomic technologies have enabled unbiased identification of transcriptionally defined spatial domains that move beyond classic cytoarchitecture. We used the Visium spatial gene expression platform to generate a data-driven molecular neuroanatomical atlas across the anterior-posterior axis of the human dorsolateral prefrontal cortex. Integration with paired single-nucleus RNA-sequencing data revealed distinct cell type compositions and cell-cell interactions across spatial domains. Using PsychENCODE and publicly available data, we mapped the enrichment of cell types and genes associated with neuropsychiatric disorders to discrete spatial domains.


Subject(s)
Single-Cell Analysis , Transcriptome , Humans , Dorsolateral Prefrontal Cortex/metabolism , Prefrontal Cortex/metabolism , Prefrontal Cortex/cytology , Prefrontal Cortex/physiology , Male , Female , Cell Communication , RNA-Seq , Gene Expression Profiling , Neurons/metabolism , Neurons/physiology , Adult , Sequence Analysis, RNA
18.
Science ; 384(6698): eadh2602, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781372

ABSTRACT

Genomic profiling in postmortem brain from autistic individuals has consistently revealed convergent molecular changes. What drives these changes and how they relate to genetic susceptibility in this complex condition are not well understood. We performed deep single-nucleus RNA sequencing (snRNA-seq) to examine cell composition and transcriptomics, identifying dysregulation of cell type-specific gene regulatory networks (GRNs) in autism spectrum disorder (ASD), which we corroborated using single-nucleus assay for transposase-accessible chromatin with sequencing (snATAC-seq) and spatial transcriptomics. Transcriptomic changes were primarily cell type specific, involving multiple cell types, most prominently interhemispheric and callosal-projecting neurons, interneurons within superficial laminae, and distinct glial reactive states involving oligodendrocytes, microglia, and astrocytes. Autism-associated GRN drivers and their targets were enriched in rare and common genetic risk variants, connecting autism genetic susceptibility and cellular and circuit alterations in the human brain.


Subject(s)
Autism Spectrum Disorder , Gene Regulatory Networks , Neurons , Single-Cell Analysis , Transcriptome , Humans , Autism Spectrum Disorder/genetics , Neurons/metabolism , Genetic Predisposition to Disease , Astrocytes/metabolism , Brain/metabolism , Genomics , Oligodendroglia/metabolism , Microglia/metabolism , RNA-Seq , Male , Interneurons/metabolism , Chromatin/metabolism , Female , Sequence Analysis, RNA
19.
Front Endocrinol (Lausanne) ; 15: 1382896, 2024.
Article in English | MEDLINE | ID: mdl-38800474

ABSTRACT

Background: Proliferative diabetic retinopathy (PDR), a major cause of blindness, is characterized by complex pathogenesis. This study integrates single-cell RNA sequencing (scRNA-seq), Non-negative Matrix Factorization (NMF), machine learning, and AlphaFold 2 methods to explore PDR at the molecular level. Methods: We analyzed scRNA-seq data from PDR patients and healthy controls to identify distinct cellular subtypes and gene expression patterns. NMF was used to define specific transcriptional programs in PDR. The oxidative stress-related genes (ORGs) identified within Meta-Program 1 were utilized to construct a predictive model using twelve machine learning algorithms. Furthermore, we employed AlphaFold 2 for the prediction of protein structures, complementing this with molecular docking to validate the structural foundation of potential therapeutic targets. We also analyzed protein-protein interaction (PPI) networks and the interplay among key ORGs. Results: Our scRNA-seq analysis revealed five major cell types and 14 subcell types in PDR patients, with significant differences in gene expression compared to those in controls. We identified three key meta-programs underscoring the role of microglia in the pathogenesis of PDR. Three critical ORGs (ALKBH1, PSIP1, and ATP13A2) were identified, with the best-performing predictive model demonstrating high accuracy (AUC of 0.989 in the training cohort and 0.833 in the validation cohort). Moreover, AlphaFold 2 predictions combined with molecular docking revealed that resveratrol has a strong affinity for ALKBH1, indicating its potential as a targeted therapeutic agent. PPI network analysis revealed a complex network of interactions among the hub ORGs and other genes, suggesting a collective role in PDR pathogenesis. Conclusion: This study provides insights into the cellular and molecular aspects of PDR, identifying potential biomarkers and therapeutic targets using advanced technological approaches.


Subject(s)
Diabetic Retinopathy , Machine Learning , Humans , Diabetic Retinopathy/genetics , Diabetic Retinopathy/metabolism , Diabetic Retinopathy/pathology , Molecular Docking Simulation , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , RNA-Seq , Protein Interaction Maps , Female , Male , Oxidative Stress , Case-Control Studies , Single-Cell Gene Expression Analysis
20.
BMC Bioinformatics ; 25(1): 198, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38789920

ABSTRACT

BACKGROUND: Single-cell transcriptome sequencing (scRNA-Seq) has allowed new types of investigations at unprecedented levels of resolution. Among the primary goals of scRNA-Seq is the classification of cells into distinct types. Many approaches build on existing clustering literature to develop tools specific to single-cell data. However, almost all of these methods rely on heuristics or user-supplied parameters to control the number of clusters. This affects both the resolution of the clusters within the original dataset and their replicability across datasets. While many recommendations exist, in general, there is little assurance that any given set of parameters will represent an optimal choice in the trade-off between cluster resolution and replicability. For instance, another set of parameters may result in more clusters that are also more replicable. RESULTS: Here, we propose Dune, a new method for optimizing the trade-off between the resolution of the clusters and their replicability. Our method takes as input a set of clustering results (or partitions) on a single dataset and iteratively merges clusters within each partition in order to maximize their concordance between partitions. As demonstrated on multiple datasets from different platforms, Dune outperforms existing techniques that rely on hierarchical merging for reducing the number of clusters, in terms of replicability of the resultant merged clusters as well as concordance with ground truth. Dune is available as an R package on Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/Dune.html . CONCLUSIONS: Cluster refinement by Dune helps improve the robustness of any clustering analysis and reduces the reliance on tuning parameters. This method provides an objective approach for borrowing information across multiple clusterings to generate replicable clusters most likely to represent common biological features across multiple datasets.
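
The merging procedure described in the abstract above can be sketched as a greedy loop. This is a Python illustration, not the API of the Dune R package. The abstract only says "concordance", so using the mean pairwise adjusted Rand index (ARI) as the concordance measure is an assumption here, and the function names `adjusted_rand_index`, `mean_ari`, and `greedy_merge` are hypothetical. At each step, the sketch tries every pair of clusters in every partition, applies the single merge that most increases the mean ARI, and stops when no merge helps.

```python
from itertools import combinations

import numpy as np


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label vectors of equal length."""
    _, ai = np.unique(np.asarray(a), return_inverse=True)
    _, bi = np.unique(np.asarray(b), return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1), dtype=np.int64)
    np.add.at(table, (ai, bi), 1)  # contingency table of the two partitions

    def comb2(x):
        return x * (x - 1) / 2.0

    sum_cells = comb2(table).sum()
    sum_rows = comb2(table.sum(axis=1)).sum()
    sum_cols = comb2(table.sum(axis=0)).sum()
    expected = sum_rows * sum_cols / comb2(len(ai))
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return float((sum_cells - expected) / (max_index - expected))


def mean_ari(partitions):
    """Mean ARI over all pairs of partitions."""
    pairs = list(combinations(range(len(partitions)), 2))
    total = sum(adjusted_rand_index(partitions[i], partitions[j]) for i, j in pairs)
    return total / len(pairs)


def greedy_merge(partitions):
    """Repeatedly apply the single cluster merge that most improves mean ARI."""
    parts = [np.asarray(p).copy() for p in partitions]
    best = mean_ari(parts)
    while True:
        best_move = None
        for k, p in enumerate(parts):
            for c1, c2 in combinations(np.unique(p), 2):
                trial = p.copy()
                trial[trial == c2] = c1  # merge cluster c2 into c1
                score = mean_ari(parts[:k] + [trial] + parts[k + 1:])
                if score > best + 1e-12:
                    best, best_move = score, (k, trial)
        if best_move is None:
            return parts, best
        parts[best_move[0]] = best_move[1]


if __name__ == "__main__":
    p1 = [0, 0, 0, 1, 1, 1]
    p2 = [0, 0, 1, 2, 2, 2]  # first group over-split into two clusters
    merged, score = greedy_merge([p1, p2])
    print(merged, score)
```

Because every accepted merge removes one cluster, the loop always terminates; the exhaustive search over cluster pairs is quadratic in the number of clusters per partition, which is acceptable for typical clustering outputs but is a simplification chosen for clarity.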


Subject(s)
RNA-Seq , Single-Cell Analysis , Software , Single-Cell Analysis/methods , RNA-Seq/methods , Cluster Analysis , Algorithms , Sequence Analysis, RNA/methods , Humans , Transcriptome/genetics , Reproducibility of Results , Gene Expression Profiling/methods , Single-Cell Gene Expression Analysis