Results 1 - 20 of 6,362
1.
BMC Bioinformatics ; 25(1): 205, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834962

ABSTRACT

BACKGROUND: Although RNA-seq data are traditionally used for quantifying gene expression levels, the same data could be useful in an integrated approach to compute genetic distances as well. Challenges to using mRNA sequences for computing genetic distances include the relatively high conservation of coding sequences and the presence of paralogous and, in some species, homeologous genes. RESULTS: We developed a new computational method, RNA-clique, for calculating genetic distances using assembled RNA-seq data and assessed the efficacy of the method using biological and simulated data. The method employs reciprocal BLASTn followed by graph-based filtering to ensure that only orthologous genes are compared. Each vertex in the graph constructed for filtering represents a gene in a specific sample under comparison, and an edge connects a pair of vertices if the genes they represent are best matches for each other in their respective samples. The distance computation is a function of the BLAST alignment statistics and the constructed graph and incorporates only those genes that are present in some complete connected component of this graph. As a biological testbed we used RNA-seq data of tall fescue (Lolium arundinaceum), an allohexaploid plant (2n = 14 Gb), and bluehead wrasse (Thalassoma bifasciatum), a teleost fish. RNA-clique reliably distinguished individual tall fescue plants by genotype and distinguished bluehead wrasse RNA-seq samples by individual. In tests with simulated RNA-seq data, the ground truth phylogeny was accurately recovered from the computed distances. Moreover, tests of the algorithm parameters indicated that, even with stringent filtering for orthologs, sufficient sequence data were retained for the distance computations.
Although comparisons with an alternative method revealed that RNA-clique has relatively high time and memory requirements, the comparisons also showed that RNA-clique's results were at least as reliable as the alternative's for tall fescue data and were much more reliable for the bluehead wrasse data. CONCLUSION: Results of this work indicate that RNA-clique works well as a way of deriving genetic distances from RNA-seq data, thus providing a methodological integration of functional and genetic diversity studies.


Subjects
RNA-Seq , RNA-Seq/methods , Sequence Analysis, RNA/methods , Computational Biology/methods , Algorithms
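The graph-based ortholog filter described in the RNA-clique abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the input layout (`best_hits`), the function names, and the reading of "complete connected component" as a component with exactly one gene per sample in which every pair of genes is a reciprocal best match are all assumptions made for illustration.

```python
from itertools import combinations


def reciprocal_best_hit_edges(best_hits):
    """Build edges between genes that are each other's best match.

    best_hits[(a, b)] is a hypothetical dict mapping each gene of sample a
    to its best match in sample b (for example, parsed from BLASTn output).
    """
    edges = set()
    for (a, b), hits in best_hits.items():
        reverse = best_hits.get((b, a), {})
        for gene_a, gene_b in hits.items():
            if reverse.get(gene_b) == gene_a:
                edges.add(frozenset({(a, gene_a), (b, gene_b)}))
    return edges


def complete_components(edges, n_samples):
    """Keep connected components that cover every sample once and are cliques."""
    adjacency = {}
    for edge in edges:
        u, v = tuple(edge)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    seen, kept = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first search to collect one connected component.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component

        samples = {sample for sample, _ in component}
        is_clique = all(v in adjacency[u] for u, v in combinations(component, 2))
        # Assumed filter: one gene per sample, all samples present, fully connected.
        if len(component) == n_samples and len(samples) == n_samples and is_clique:
            kept.append(component)
    return kept
```

Only genes in the returned components would then contribute to a distance computed from the alignment statistics; that step is omitted here because the abstract does not specify the formula.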
2.
PLoS One ; 19(6): e0293688, 2024.
Article in English | MEDLINE | ID: mdl-38843139

ABSTRACT

It has been documented that variations in glycosylation on glycoprotein hormones confer distinctly different biological features to the corresponding glycoforms when multiple in vitro biochemical readings are analyzed. We here applied next-generation RNA sequencing to explore changes in the transcriptome of rat granulosa cells exposed for 0, 6, and 12 h to 100 ng/ml of four highly purified follicle-stimulating hormone (FSH) glycoforms, each exhibiting different glycosylation patterns: a. human pituitary FSH18/21 (hypo-glycosylated); b. human pituitary FSH24 (fully glycosylated); c. equine FSH (eqFSH) (hypo-glycosylated); and d. Chinese-hamster ovary cell-derived human recombinant FSH (recFSH) (fully glycosylated). Total RNA from triplicate incubations was prepared from FSH glycoform-exposed cultured granulosa cells obtained from DES-pretreated immature female rats, and RNA libraries were sequenced in a HiSeq 2500 sequencer (2 x 125 bp paired-end format, 10-15 x 10^6 reads/sample). The computational workflow focused on investigating differences among the four FSH glycoforms at three levels: gene expression, enriched biological processes, and perturbed pathways. Among the top 200 differentially expressed genes, only 4 (0.6%) were shared by all 4 glycoforms at 6 h, whereas 118 genes (40%) were shared at 12 h. Follicle-stimulating hormone glycoforms stimulated different patterns of exclusive and associated upregulated biological processes in a glycoform- and time-dependent fashion, with more shared biological processes after 12 h of exposure and fewer treatment-specific ones, except for recFSH, which exhibited stronger responses with more specifically associated processes at this time. Similar results were found for downregulated processes, with a greater number of processes at 6 h or 12 h, depending on the particular glycoform.
In general, there were fewer downregulated than upregulated processes at both 6 h and 12 h, with FSH18/21 exhibiting the largest number of downregulated associated processes at 6 h while eqFSH exhibited the greatest number at 12 h. Signaling cascades, largely linked to cAMP-PKA, MAPK, and PI3K/AKT pathways, were detected as differentially activated by the glycoforms, with each glycoform exhibiting its own molecular signature. These data extend previous observations demonstrating glycosylation-dependent, distinctly different regulation of gene expression and intracellular signaling pathways triggered by FSH in granulosa cells. The results also suggest the importance of individual FSH glycoform glycosylation for the conformation of the ligand-receptor complex and induced signaling pathways.


Subjects
Follicle Stimulating Hormone , Granulosa Cells , Transcriptome , Animals , Female , Granulosa Cells/metabolism , Granulosa Cells/drug effects , Follicle Stimulating Hormone/pharmacology , Follicle Stimulating Hormone/metabolism , Rats , Glycosylation , Transcriptome/drug effects , Humans , Cells, Cultured , RNA-Seq/methods , CHO Cells , Cricetulus
3.
Nat Commun ; 15(1): 4710, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844475

ABSTRACT

Alzheimer's Disease (AD) pathology has been increasingly explored through single-cell and single-nucleus RNA-sequencing (scRNA-seq & snRNA-seq) and spatial transcriptomics (ST). However, the surge in data demands a comprehensive, user-friendly repository. Addressing this, we introduce a single-cell and spatial RNA-seq database for Alzheimer's disease (ssREAD). It offers a broader spectrum of AD-related datasets, an optimized analytical pipeline, and improved usability. The database encompasses 1,053 samples (277 integrated datasets) from 67 AD-related scRNA-seq & snRNA-seq studies, totaling 7,332,202 cells. Additionally, it archives 381 ST datasets from 18 human and mouse brain studies. Each dataset is annotated with details such as species, gender, brain region, disease/control status, age, and AD Braak stages. ssREAD also provides an analysis suite for cell clustering, identification of differentially expressed and spatially variable genes, cell-type-specific marker genes and regulons, and spot deconvolution for integrative analysis. ssREAD is freely available at https://bmblx.bmi.osumc.edu/ssread/ .


Subjects
Alzheimer Disease , RNA-Seq , Single-Cell Analysis , Alzheimer Disease/genetics , Humans , Single-Cell Analysis/methods , Animals , Mice , RNA-Seq/methods , Brain/metabolism , Brain/pathology , Databases, Genetic , Transcriptome , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Male
4.
BMC Genomics ; 25(1): 554, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38831306

ABSTRACT

BACKGROUND: Sperm storage capacity (SSC) determines the duration of fertility in hens and is an important reproduction trait that cannot be ignored in production. Currently, the genetic mechanism of SSC is still unclear in hens. Therefore, to explore the genetic basis of SSC, we analyzed the uterus-vagina junction (UVJ) of hens with different SSC at different times after insemination by RNA-seq and Ribo-seq. RESULTS: Our results showed that 589, 596, and 527 differentially expressed genes (DEGs), 730, 783, and 324 differentially translated genes (DTGs), and 804, 625, and 467 differential translation efficiency genes (DTEGs) were detected on the 5th, 10th, and 15th days after insemination, respectively. At the transcription level, we found that the differences of SSC at different times after insemination were mainly reflected in the transmission of information between cells, the composition of intercellular adhesion complexes, the regulation of ion channels, the regulation of cellular physiological activities, the composition of cells, and the composition of cell membranes. At the translation efficiency (TE) level, the differences of SSC were mainly related to the physiological and metabolic activities in the cell, the composition of the organelle membrane, the physiological activities of oxidation, cell components, and cell growth processes. According to pathway analysis, SSC was related to neuroactive ligand-receptor interaction, histidine metabolism, and PPAR signaling pathway at the transcriptional level and glutathione metabolism, oxidative phosphorylation, calcium signaling pathway, cell adhesion molecules, galactose metabolism, and Wnt signaling pathway at the TE level. We screened candidate genes affecting SSC at transcriptional levels (COL4A4, MUC6, MCHR2, TACR1, AVPR1A, COL1A1, HK2, RB1, VIPR2, HMGCS2) and TE levels (COL4A4, MUC6, CYCS, NDUFA13, CYTB, RRM2, CAMK4, HRH2, LCT, GCK, GALT).
Among them, COL4A4 and MUC6 were the key candidate genes differing in transcription, translation, and translation efficiency. CONCLUSIONS: Our study used the combined analysis of RNA-seq and Ribo-seq for the first time to investigate the SSC and reveal the physiological processes associated with SSC. The key candidate genes affecting SSC were screened, and the theoretical basis was provided for the analysis of the molecular regulation mechanism of SSC.


Subjects
Chickens , RNA-Seq , Spermatozoa , Animals , Chickens/genetics , Female , Male , Spermatozoa/metabolism , Gene Expression Profiling , Insemination , Transcriptome , Sequence Analysis, RNA , Ribosome Profiling
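The abstract above combines RNA-seq and Ribo-seq and reports genes with differential translation efficiency (TE), but it does not state how TE was computed. A common convention, assumed here, is the per-gene ratio of ribosome-footprint abundance to mRNA abundance on a log2 scale. The sketch below uses hypothetical gene labels, hypothetical normalized abundances, and an assumed pseudocount.

```python
import math


def translation_efficiency(ribo, rna, pseudocount=1.0):
    """Per-gene log2 ratio of Ribo-seq abundance to RNA-seq abundance.

    ribo and rna map gene IDs to normalized abundances (hypothetical inputs).
    The pseudocount is an assumption that avoids division by zero.
    Only genes present in both inputs are scored.
    """
    return {
        gene: math.log2((ribo[gene] + pseudocount) / (rna[gene] + pseudocount))
        for gene in ribo.keys() & rna.keys()
    }


# Hypothetical abundances for two placeholder genes.
te = translation_efficiency({"geneA": 7.0, "geneB": 1.0}, {"geneA": 1.0, "geneB": 3.0})
```

A positive value indicates more ribosome occupancy than the mRNA level alone would suggest; a negative value indicates less.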
5.
Genome Biol ; 25(1): 145, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38831386

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) have led to groundbreaking advancements in life sciences. To develop bioinformatics tools for scRNA-seq and SRT data and perform unbiased benchmarks, data simulation has been widely adopted by providing explicit ground truth and generating customized datasets. However, the performance of simulation methods under multiple scenarios has not been comprehensively assessed, making it challenging to choose suitable methods without practical guidelines. RESULTS: We systematically evaluated 49 simulation methods developed for scRNA-seq and/or SRT data in terms of accuracy, functionality, scalability, and usability using 152 reference datasets derived from 24 platforms. SRTsim, scDesign3, ZINB-WaVE, and scDesign2 have the best accuracy performance across various platforms. Unexpectedly, some methods tailored to scRNA-seq data have potential compatibility for simulating SRT data. Lun, SPARSim, and scDesign3-tree outperform other methods under corresponding simulation scenarios. Phenopath, Lun, Simple, and MFA yield high scalability scores but they cannot generate realistic simulated data. Users should consider the trade-offs between method accuracy and scalability (or functionality) when making decisions. Additionally, execution errors are mainly caused by failed parameter estimations and appearance of missing or infinite values in calculations. We provide practical guidelines for method selection, a standard pipeline Simpipe ( https://github.com/duohongrui/simpipe ; https://doi.org/10.5281/zenodo.11178409 ), and an online tool Simsite ( https://www.ciblab.net/software/simshiny/ ) for data simulation. CONCLUSIONS: No method performs best on all criteria, thus a good-yet-not-the-best method is recommended if it solves problems effectively and reasonably. 
Our comprehensive work provides crucial insights for developers on modeling gene expression data and fosters the simulation process for users.


Subjects
Gene Expression Profiling , Single-Cell Analysis , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Humans , Software , Computer Simulation , Transcriptome , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods , RNA-Seq/standards
6.
PLoS One ; 19(6): e0297124, 2024.
Article in English | MEDLINE | ID: mdl-38833485

ABSTRACT

In this research, a high-throughput RNA sequencing-based transcriptome analysis technique (RNA-Seq) was used to evaluate differentially expressed genes (DEGs) in wild-type Arabidopsis seedlings in response to AtPep1, a well-known peptide representing an endogenous damage-associated molecular pattern (DAMP), and flg22, a well-known microbe-associated molecular pattern (MAMP). We compared and dissected the global transcriptional landscape of Arabidopsis thaliana in response to AtPep1 and flg22 and could identify shared and unique DEGs in response to these elicitors. We found that while a remarkable number of flg22 up-regulated genes were also induced by AtPep1, 256 genes were exclusively up-regulated in response to flg22, and 328 were exclusively up-regulated in response to AtPep1. Furthermore, among down-regulated DEGs, 107 genes were exclusively down-regulated by flg22 treatment, while 411 genes were exclusively down-regulated by AtPep1. We found a number of hitherto overlooked genes to be induced upon treatment with either flg22 or AtPep1, indicating their possible involvement in general pathways in innate immunity. Here, we characterized two of them, namely PP2-B13 and ACLP1. pp2-b13 and aclp1 mutants showed increased susceptibility to infection by the virulent pathogen Pseudomonas syringae DC3000 and its mutant Pst DC3000 hrcC (lacking the type III secretion system), as evidenced by increased proliferation of the two pathogens in planta. Further, we present evidence that the aclp1 mutant is deficient in ethylene production upon flg22 treatment, while the pp2-b13 mutant is deficient in the production of reactive oxygen species (ROS). The results from this research provide new information for a better understanding of the immune system in Arabidopsis.


Subjects
Arabidopsis Proteins , Arabidopsis , Gene Expression Regulation, Plant , Arabidopsis/genetics , Arabidopsis/immunology , Arabidopsis/microbiology , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Plant Immunity/genetics , RNA-Seq/methods , Pseudomonas syringae/pathogenicity , Gene Expression Profiling , Innate Immunity Recognition
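The shared and exclusive DEG counts reported in the abstract above amount to a set partition of two gene lists. A minimal sketch follows, with a hypothetical function name and placeholder gene IDs; the thresholds used to call DEGs are not given in the abstract and are outside this sketch.

```python
def partition_degs(degs_a, degs_b):
    """Split two DEG lists into shared genes and genes exclusive to each list."""
    a, b = set(degs_a), set(degs_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Placeholder IDs standing in for genes up-regulated by two treatments.
groups = partition_degs(["gene1", "gene2", "gene3"], ["gene2", "gene3", "gene4"])
counts = {name: len(genes) for name, genes in groups.items()}
```

The same partition would be applied separately to up-regulated and down-regulated lists to obtain the four exclusive counts the abstract reports.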
7.
J Exp Med ; 221(8), 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-38847806

ABSTRACT

Due to bladder tumors' contact with urine, urine-derived cells (UDCs) may serve as a surrogate for monitoring the tumor microenvironment (TME) in bladder cancer (BC). However, the composition of UDCs and the extent to which they mirror the tumor remain poorly characterized. We generated the first single-cell RNA-sequencing of BC patient UDCs with matched tumor and peripheral blood mononuclear cells (PBMC). BC urine was more cellular than healthy donor (HD) urine, containing multiple immune populations including myeloid cells, CD4+ and CD8+ T cells, natural killer (NK) cells, B cells, and dendritic cells (DCs) in addition to tumor and stromal cells. Immune UDCs were transcriptionally more similar to tumor than blood. UDCs encompassed cytotoxic and activated CD4+ T cells, exhausted and tissue-resident memory CD8+ T cells, macrophages, germinal-center-like B cells, tissue-resident and adaptive NK cells, and regulatory DCs found in tumor but lacking or absent in blood. Our findings suggest BC UDCs may be surrogates for the TME and serve as therapeutic biomarkers.


Subjects
Tumor Microenvironment , Urinary Bladder Neoplasms , Urinary Bladder Neoplasms/immunology , Urinary Bladder Neoplasms/genetics , Urinary Bladder Neoplasms/pathology , Humans , Tumor Microenvironment/immunology , Male , Killer Cells, Natural/immunology , Female , CD8-Positive T-Lymphocytes/immunology , Aged , CD4-Positive T-Lymphocytes/immunology , Single-Cell Analysis/methods , Dendritic Cells/immunology , Middle Aged , Leukocytes, Mononuclear/immunology , Leukocytes, Mononuclear/metabolism , RNA-Seq , Single-Cell Gene Expression Analysis
8.
BMC Bioinformatics ; 25(1): 181, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38720247

ABSTRACT

BACKGROUND: RNA sequencing combined with machine learning techniques has provided a modern approach to the molecular classification of cancer. Class predictors, reflecting the disease class, can be constructed for known tissue types using the gene expression measurements extracted from cancer patients. One challenge of current cancer predictors is that they often have suboptimal performance estimates when integrating molecular datasets generated from different labs. Often, the quality of the data is variable, procured differently, and contains unwanted noise hampering the ability of a predictive model to extract useful information. Data preprocessing methods can be applied in attempts to reduce these systematic variations and harmonize the datasets before they are used to build a machine learning model for resolving tissue of origin. RESULTS: We aimed to investigate the impact of data preprocessing steps (focusing on normalization, batch effect correction, and data scaling) through trial and comparison. Our goal was to improve the cross-study predictions of tissue of origin for common cancers on large-scale RNA-Seq datasets derived from thousands of patients and over a dozen tumor types. The results showed that the choice of data preprocessing operations affected the performance of the associated classifier models constructed for tissue of origin predictions in cancer. CONCLUSION: By using TCGA as a training set and applying data preprocessing methods, we demonstrated that batch effect correction improved performance measured by weighted F1-score in resolving tissue of origin against an independent GTEx test dataset. On the other hand, the use of data preprocessing operations worsened classification performance when the independent test dataset was aggregated from separate studies in ICGC and GEO.
Therefore, based on our findings with these publicly available large-scale RNA-Seq datasets, the application of data preprocessing techniques to a machine learning pipeline is not always appropriate.


Subjects
Machine Learning , Neoplasms , RNA-Seq , Humans , RNA-Seq/methods , Neoplasms/genetics , Transcriptome/genetics , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Computational Biology/methods
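The abstract above reports classifier performance as a weighted F1-score. As a reminder of what that metric measures, the sketch below computes the support-weighted mean of per-class F1 scores in plain Python. The tissue labels are hypothetical, and the study's own evaluation code is not shown in the abstract, so this is only an illustration of the standard definition.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 scores, weighted by each class's count in y_true."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Hypothetical tissue-of-origin labels for four samples.
score = weighted_f1(["lung", "lung", "liver", "liver"],
                    ["lung", "liver", "liver", "liver"])
```

Because classes are weighted by their frequency in the true labels, frequent tumor types dominate the score, which matters when test cohorts are imbalanced.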
9.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701412

ABSTRACT

Trajectory inference is a crucial task in single-cell RNA-sequencing downstream analysis, which can reveal the dynamic processes of biological development, including cell differentiation. Dimensionality reduction is an important step in the trajectory inference process. However, most existing trajectory methods rely on cell features derived from traditional dimensionality reduction methods, such as principal component analysis and uniform manifold approximation and projection. These methods are not specifically designed for trajectory inference and fail to fully leverage prior information from upstream analysis, limiting their performance. Here, we introduce scCRT, a novel dimensionality reduction model for trajectory inference. In order to utilize prior information to learn accurate cell representations, scCRT integrates two feature learning components: a cell-level pairwise module and a cluster-level contrastive module. The cell-level module focuses on learning accurate cell representations in a reduced-dimensionality space while maintaining the cell-cell positional relationships in the original space. The cluster-level contrastive module uses prior cell state information to aggregate similar cells, preventing excessive dispersion in the low-dimensional space. Experimental findings from 54 real and 81 synthetic datasets, totaling 135 datasets, highlighted the superior performance of scCRT compared with commonly used trajectory inference methods. Additionally, an ablation study revealed that both cell-level and cluster-level modules enhance the model's ability to learn accurate cell features, facilitating cell lineage inference. The source code of scCRT is available at https://github.com/yuchen21-web/scCRT-for-scRNA-seq.


Subjects
Algorithms , Single-Cell Analysis , Single-Cell Analysis/methods , Humans , RNA-Seq/methods , Computational Biology/methods , Software , Sequence Analysis, RNA/methods , Animals , Single-Cell Gene Expression Analysis
10.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701413

ABSTRACT

With the emergence of large amounts of single-cell RNA sequencing (scRNA-seq) data, the exploration of computational methods has become critical in revealing biological mechanisms. Clustering is a representative approach for deciphering cellular heterogeneity embedded in scRNA-seq data. However, due to the diversity of datasets, none of the existing single-cell clustering methods shows overwhelming performance on all datasets. Weighted ensemble methods have been proposed to integrate multiple results to improve heterogeneity analysis performance. These methods are usually weighted by considering the reliability of the base clustering results, ignoring the performance difference of the same base clustering on different cells. In this paper, we propose scEWE, a self-representative ensemble learning framework based on a high-order element-wise weighting strategy. By assigning different base clustering weights to individual cells, we construct and optimize the consensus matrix in a careful and exquisite way. In addition, we extracted the high-order information between cells, which enhanced the ability to represent the similarity relationship between cells. scEWE is experimentally shown to significantly outperform the state-of-the-art methods, which strongly demonstrates the effectiveness of the method and supports the potential applications in complex single-cell data analytical problems.


Subjects
Sequence Analysis, RNA , Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , Sequence Analysis, RNA/methods , Algorithms , Computational Biology/methods , Humans , RNA-Seq/methods
11.
Genet Res (Camb) ; 2024: 4285171, 2024.
Article in English | MEDLINE | ID: mdl-38715622

ABSTRACT

Bladder cancer has recently seen an alarming increase in global diagnoses, ascending as a predominant cause of cancer-related mortalities. Given this pressing scenario, there is a burgeoning need to identify effective biomarkers for both the diagnosis and therapeutic guidance of bladder cancer. This study focuses on evaluating the potential of high-definition computed tomography (CT) imagery coupled with RNA-sequencing analysis to accurately predict bladder tumor stages, utilizing deep residual networks. Data for this study, including CT images and RNA-Seq datasets for 82 high-grade bladder cancer patients, were sourced from the TCIA and TCGA databases. We employed Cox and lasso regression analyses to determine radiomics and gene signatures, leading to the identification of a three-factor radiomics signature and a four-gene signature in our bladder cancer cohort. ROC curve analyses underscored the strong predictive capacities of both these signatures. Furthermore, we formulated a nomogram integrating clinical features, radiomics, and gene signatures. This nomogram's AUC scores stood at 0.870, 0.873, and 0.971 for 1-year, 3-year, and 5-year predictions, respectively. Our model, leveraging radiomics and gene signatures, presents significant promise for enhancing diagnostic precision in bladder cancer prognosis, advocating for its clinical adoption.


Subjects
Neoplasm Staging , Neural Networks, Computer , Tomography, X-Ray Computed , Urinary Bladder Neoplasms , Urinary Bladder Neoplasms/genetics , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/pathology , Humans , Tomography, X-Ray Computed/methods , Male , Female , RNA-Seq/methods , Aged , Nomograms , Middle Aged , Biomarkers, Tumor/genetics , ROC Curve , Prognosis , Transcriptome , Radiomics
12.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38706317

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) enables the exploration of cellular heterogeneity by analyzing gene expression profiles in complex tissues. However, scRNA-seq data often suffer from technical noise, dropout events and sparsity, hindering downstream analyses. Although existing works attempt to mitigate these issues by utilizing graph structures for data denoising, they involve the risk of propagating noise and fall short of fully leveraging the inherent data relationships, relying mainly on either cell-cell or gene-gene associations and on graphs constructed from the initial noisy data. To this end, this study presents single-cell bilevel feature propagation (scBFP), a two-step graph-based feature propagation method. It initially imputes zero values using non-zero values, ensuring that the imputation process does not affect the non-zero values due to dropout. Subsequently, it denoises the entire dataset by leveraging gene-gene and cell-cell relationships in the respective steps. Extensive experimental results on scRNA-seq data demonstrate the effectiveness of scBFP in various downstream tasks, uncovering valuable biological insights.


Subjects
Sequence Analysis, RNA , Single-Cell Analysis , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , Humans , Algorithms , Gene Expression Profiling/methods , Computational Biology/methods , RNA-Seq/methods
13.
Development ; 151(10), 2024 May 15.
Article in English | MEDLINE | ID: mdl-38804879

ABSTRACT

Dorsal interneurons (dIs) in the spinal cord encode the perception of touch, pain, heat, itchiness and proprioception. Previous studies using genetic strategies in animal models have revealed important insights into dI development, but the molecular details of how dIs arise as distinct populations of neurons remain incomplete. We have developed a resource to investigate dI fate specification by combining a single-cell RNA-Seq atlas of mouse embryonic stem cell-derived dIs with pseudotime analyses. To validate this in silico resource as a useful tool, we used it to first identify genes that are candidates for directing the transition states that lead to distinct dI lineage trajectories, and then validated them using in situ hybridization analyses in the developing mouse spinal cord in vivo. We have also identified an endpoint of the dI5 lineage trajectory and found that dIs become more transcriptionally homogeneous during terminal differentiation. This study introduces a valuable tool for further discovery about the timing of gene expression during dI differentiation and demonstrates its utility in clarifying dI lineage relationships.


Subjects
Cell Differentiation , Cell Lineage , Gene Expression Regulation, Developmental , Interneurons , Spinal Cord , Animals , Mice , Spinal Cord/metabolism , Spinal Cord/embryology , Cell Lineage/genetics , Interneurons/metabolism , Interneurons/cytology , Cell Differentiation/genetics , Single-Cell Analysis , Mouse Embryonic Stem Cells/metabolism , Mouse Embryonic Stem Cells/cytology , RNA-Seq
14.
Methods Mol Biol ; 2808: 121-127, 2024.
Article in English | MEDLINE | ID: mdl-38743366

ABSTRACT

During the infection of a host cell by an infectious agent, a series of gene expression changes occurs as a consequence of host-pathogen interactions. Unraveling this complex interplay is the key to understanding microbial virulence and host response pathways, thus providing the basis for new molecular insights into the mechanisms of pathogenesis and the corresponding immune response. Dual RNA sequencing (dual RNA-seq) has been developed to simultaneously determine pathogen and host transcriptomes, enabling both differential and coexpression analyses between the two partners as well as genome characterization in the case of RNA viruses. Here, we provide a detailed laboratory protocol and bioinformatics analysis guidelines for dual RNA-seq experiments focusing on, but not restricted to, measles virus (MeV) as a pathogen of interest. The application of dual RNA-seq technologies in MeV-infected patients can potentially provide valuable information on the structure of the viral RNA genome and on cellular innate immune responses and drive the discovery of new targets for antiviral therapy.


Subjects
Genome, Viral , Host-Pathogen Interactions , Measles virus , Measles , RNA, Viral , Humans , Measles/virology , Measles/immunology , Measles/genetics , Measles virus/genetics , Measles virus/pathogenicity , RNA, Viral/genetics , Host-Pathogen Interactions/genetics , Host-Pathogen Interactions/immunology , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods , Transcriptome , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods
15.
Nat Commun ; 15(1): 4055, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744843

ABSTRACT

We introduce GRouNdGAN, a gene regulatory network (GRN)-guided reference-based causal implicit generative model for simulating single-cell RNA-seq data, in silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on six experimental reference datasets, we show that our model captures non-linear TF-gene dependencies and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. GRouNdGAN can synthesize cells under new conditions to perform in silico TF knockout experiments. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest.


Subjects
Algorithms , Computer Simulation , Gene Regulatory Networks , RNA-Seq , Single-Cell Analysis , Single-Cell Analysis/methods , RNA-Seq/methods , Humans , Transcription Factors/metabolism , Transcription Factors/genetics , Computational Biology/methods , Benchmarking , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis
16.
PLoS One ; 19(5): e0302696, 2024.
Article in English | MEDLINE | ID: mdl-38753612

ABSTRACT

Pathway enrichment analysis is a ubiquitous computational biology method to interpret a list of genes (typically derived from the association of large-scale omics data with phenotypes of interest) in terms of higher-level, predefined gene sets that share biological function, chromosomal location, or other common features. Among the many tools developed so far, Gene Set Enrichment Analysis (GSEA) stands out as one of the pioneering and most widely used methods. Although originally developed for microarray data, GSEA is now extensively used for RNA-seq data analysis. Here, we quantitatively assessed the performance of a variety of GSEA modalities and provide guidance on the practical use of GSEA in RNA-seq experiments. We leveraged harmonized RNA-seq datasets available from The Cancer Genome Atlas (TCGA) in combination with large, curated pathway collections from the Molecular Signatures Database to obtain cancer-type-specific target pathway lists across multiple cancer types. We carried out a detailed analysis of GSEA performance using both gene-set and phenotype permutations combined with four different choices for the Kolmogorov-Smirnov enrichment statistic. Based on our benchmarks, we conclude that the classic/unweighted gene-set permutation approach offered comparable or better sensitivity-vs-specificity tradeoffs across cancer types compared with other, more complex and computationally intensive permutation methods. Finally, we analyzed other large cohorts for thyroid cancer and hepatocellular carcinoma. We used a new consensus metric, the Enrichment Evidence Score (EES), which showed remarkable agreement between pathways identified in TCGA and those from other sources, despite differences in cancer etiology. This finding suggests an EES-based strategy to identify a core set of pathways that may be complemented by an expanded set of pathways for downstream exploratory analysis. This work fills the existing gap in current guidelines and benchmarks for the use of GSEA with RNA-seq data and provides a framework to enable detailed benchmarking of other RNA-seq-based pathway analysis tools.
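
For readers unfamiliar with the "classic/unweighted gene-set permutation approach" named in this abstract, the following is a minimal sketch of the idea, not the authors' benchmark code and not the official GSEA implementation. It assumes a pre-ranked gene list and a gene set given as plain Python collections; the function names `enrichment_score` and `gene_set_permutation_pvalue` are hypothetical. The running sum rises by an equal step at every gene in the set and falls by an equal step at every gene outside it, and the score is the largest signed deviation from zero. The null distribution is built from random gene sets of the same size.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Classic (unweighted) Kolmogorov-Smirnov-like running-sum statistic."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Equal-sized step up for a hit, equal-sized step down for a miss.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the largest signed deviation from zero
    return best


def gene_set_permutation_pvalue(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Empirical p-value from random gene sets of the same size (illustrative)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(g in gene_set for g in ranked_genes)
    null = [
        enrichment_score(ranked_genes, set(rng.sample(ranked_genes, size)))
        for _ in range(n_perm)
    ]
    # Compare only against null scores with the same sign as the observed score.
    if observed >= 0:
        same_sign = [x for x in null if x >= 0]
        extreme = sum(x >= observed for x in same_sign)
    else:
        same_sign = [x for x in null if x < 0]
        extreme = sum(x <= observed for x in same_sign)
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (extreme + 1) / (len(same_sign) + 1)


if __name__ == "__main__":
    genes = [f"g{i}" for i in range(20)]  # toy ranking, most up-regulated first
    print(gene_set_permutation_pvalue(genes, {"g0", "g1", "g2"}))
```

Phenotype permutation, the other scheme mentioned in the abstract, would instead shuffle sample labels and recompute the ranking each time, which is why it is more computationally intensive.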


Subjects
Benchmarking , RNA-Seq , Humans , RNA-Seq/methods , Computational Biology/methods , Neoplasms/genetics , Genetic Databases , Gene Expression Profiling/methods
17.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38725155

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) experiments have become instrumental in developmental and differentiation studies, enabling the profiling of cells at a single or multiple time-points to uncover subtle variations in expression profiles reflecting underlying biological processes. Benchmarking studies have compared many of the computational methods used to reconstruct cellular dynamics; however, researchers still encounter challenges in their analysis due to uncertainty with respect to selecting the most appropriate methods and parameters. Even among universal data processing steps used by trajectory inference methods such as feature selection and dimension reduction, trajectory methods' performances are highly dataset-specific. To address these challenges, we developed Escort, a novel framework for evaluating a dataset's suitability for trajectory inference and quantifying trajectory properties influenced by analysis decisions. Escort evaluates the suitability of trajectory analysis and the combined effects of processing choices using trajectory-specific metrics. Escort navigates single-cell trajectory analysis through these data-driven assessments, reducing uncertainty and much of the decision burden inherent to trajectory inference analyses. Escort is implemented in an accessible R package and R/Shiny application, providing researchers with the necessary tools to make informed decisions during trajectory analysis and enabling new insights into dynamic biological processes at single-cell resolution.


Subjects
RNA-Seq , Single-Cell Analysis , Single-Cell Analysis/methods , RNA-Seq/methods , Humans , Computational Biology/methods , RNA Sequence Analysis/methods , Software , Algorithms , Gene Expression Profiling/methods , Single-Cell Gene Expression Analysis
18.
Nat Commun ; 15(1): 4050, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744866

ABSTRACT

Although more than half of all genes generate transcripts that differ in 3'UTR length, current analysis pipelines only quantify the amount but not the length of mRNA transcripts. 3'UTR length is determined by 3' end cleavage sites (CS). We map CS in more than 200 primary human and mouse cell types and increase CS annotations relative to the GENCODE database by 40%. Approximately half of all CS are used in few cell types, revealing that most genes only have one or two major 3' ends. We incorporate the CS annotations into a computational pipeline, called scUTRquant, for rapid, accurate, and simultaneous quantification of gene and 3'UTR isoform expression from single-cell RNA sequencing (scRNA-seq) data. When applying scUTRquant to data from 474 cell types and 2134 perturbations, we discover extensive 3'UTR length changes across cell types that are as widespread and coordinately regulated as gene expression changes but affect mostly different genes. Our data indicate that mRNA abundance and mRNA length are two largely independent axes of gene regulation that together determine the amount and spatial organization of protein synthesis.


Subjects
3' Untranslated Regions , Messenger RNA , Single-Cell Analysis , 3' Untranslated Regions/genetics , Humans , Animals , Mice , Messenger RNA/genetics , Messenger RNA/metabolism , Single-Cell Analysis/methods , RNA Sequence Analysis/methods , Gene Expression Regulation , RNA-Seq/methods , Computational Biology/methods , Gene Expression Profiling/methods , Single-Cell Gene Expression Analysis
19.
Sci Rep ; 14(1): 10983, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744869

ABSTRACT

Parkinson's disease (PD) is a complex neurodegenerative disorder without a cure. The onset of PD symptoms corresponds to 50% loss of midbrain dopaminergic (mDA) neurons, limiting early-stage understanding of PD. To shed light on early PD development, we study time series scRNA-seq datasets of mDA neurons obtained from patient-derived induced pluripotent stem cell differentiation. We develop a new data integration method based on Non-negative Matrix Tri-Factorization that integrates these datasets with molecular interaction networks, producing condition-specific "gene embeddings". By mining these embeddings, we predict 193 PD-related genes that are largely supported (49.7%) in the literature and are specific to the investigated PINK1 mutation. Enrichment analysis in Kyoto Encyclopedia of Genes and Genomes pathways highlights 10 PD-related molecular mechanisms perturbed during early PD development. Finally, investigating the top 20 prioritized genes reveals 12 previously unrecognized genes associated with PD that represent interesting drug targets.
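
The abstract above builds on Non-negative Matrix Tri-Factorization (NMTF). As a rough illustration of that building block only, and not of the authors' integration method (which additionally couples the factorization to molecular interaction networks), the sketch below factors a non-negative matrix X into F S G^T using the standard multiplicative updates for the Frobenius-norm objective. All names (`nmtf`, `frob_error`, the helper functions) and the toy matrix are hypothetical, and the code is pure Python for self-containment rather than speed.

```python
import random


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def mult_update(m, num, den, eps=1e-12):
    """Element-wise m * num / den; non-negative inputs stay non-negative."""
    return [[v * n / (d + eps) for v, n, d in zip(rv, rn, rd)]
            for rv, rn, rd in zip(m, num, den)]


def frob_error(x, f, s, g):
    """Squared Frobenius norm of X - F S G^T."""
    approx = matmul(matmul(f, s), transpose(g))
    return sum((a - b) ** 2 for ra, rb in zip(x, approx) for a, b in zip(ra, rb))


def rand_matrix(rng, rows, cols):
    return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]


def nmtf(x, k1, k2, n_iter=200, seed=0):
    """Factor X (n x m) into F (n x k1), S (k1 x k2), G (m x k2)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    f = rand_matrix(rng, n, k1)
    s = rand_matrix(rng, k1, k2)
    g = rand_matrix(rng, m, k2)
    xt = transpose(x)
    for _ in range(n_iter):
        gs_t = matmul(g, transpose(s))  # G S^T, shape m x k1
        # F <- F * (X G S^T) / (F S G^T G S^T)
        f = mult_update(f, matmul(x, gs_t),
                        matmul(f, matmul(transpose(gs_t), gs_t)))
        fs = matmul(f, s)  # F S, shape n x k2
        # G <- G * (X^T F S) / (G S^T F^T F S)
        g = mult_update(g, matmul(xt, fs),
                        matmul(g, matmul(transpose(fs), fs)))
        ft = transpose(f)
        # S <- S * (F^T X G) / (F^T F S G^T G)
        s = mult_update(s, matmul(matmul(ft, x), g),
                        matmul(matmul(matmul(ft, f), s), matmul(transpose(g), g)))
    return f, s, g


if __name__ == "__main__":
    toy = [[(i * j) % 5 + 1.0 for j in range(5)] for i in range(6)]
    f, s, g = nmtf(toy, k1=2, k2=2)
    print(frob_error(toy, f, s, g))
```

In a setting like the one the abstract describes, the rows of one factor would presumably serve as low-dimensional embeddings of one entity type (for example genes); how the authors derive their condition-specific "gene embeddings" is not specified here.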


Subjects
Dopaminergic Neurons , Parkinson Disease , Parkinson Disease/genetics , Parkinson Disease/pathology , Humans , Dopaminergic Neurons/metabolism , Dopaminergic Neurons/pathology , RNA-Seq/methods , Induced Pluripotent Stem Cells/metabolism , Mesencephalon/metabolism , Mesencephalon/pathology , Gene Regulatory Networks , Mutation , Cell Differentiation/genetics , Multiomics , Single-Cell Gene Expression Analysis
20.
Technol Cancer Res Treat ; 23: 15330338241252610, 2024.
Article in English | MEDLINE | ID: mdl-38766816

ABSTRACT

Background: Immunotherapy plays a significant role in the treatment of hepatocellular carcinoma (HCC). Members of the S100 protein family (S100s) have been widely implicated in the pathogenesis and progression of tumors. However, the exact mechanism by which S100s contribute to tumor immunity remains unclear. Methods: To explore the role of S100s in HCC immune cells, we collected and comparatively analyzed single-cell RNA sequencing (scRNA-seq) data of HCC and hepatitis B virus-associated HCC, mapping cell classifications and searching for S100 binding targets and downstream targets. Results: S100A6/S100A11 was differentially expressed in tumor T cells and involved in the nuclear factor (NF) κB pathway. Further investigation of the TCGA dataset revealed that patients with low S100A6/S100A11 expression had a better prognosis. Temporal cell trajectory analysis showed that activation of the NF-κB pathway occurs at a critical stage and has an important impact on the tumor microenvironment. Conclusion: Our study revealed that S100A6/S100A11 could be involved in regulating the differentiation and cellular activity of T-cell subpopulations in HCC, and that its low expression was associated with a better prognosis. It may provide a new direction for immunotherapy of HCC and a theoretical basis for future clinical applications.


Subjects
Hepatocellular Carcinoma , Neoplastic Gene Expression Regulation , Liver Neoplasms , RNA-Seq , S100 Calcium Binding Protein A6 , S100 Proteins , Single-Cell Analysis , Tumor Microenvironment , Humans , Hepatocellular Carcinoma/genetics , Hepatocellular Carcinoma/immunology , Hepatocellular Carcinoma/pathology , Hepatocellular Carcinoma/metabolism , Hepatocellular Carcinoma/etiology , Liver Neoplasms/genetics , Liver Neoplasms/immunology , Liver Neoplasms/pathology , Liver Neoplasms/metabolism , S100 Proteins/genetics , S100 Proteins/metabolism , Prognosis , S100 Calcium Binding Protein A6/genetics , S100 Calcium Binding Protein A6/metabolism , Tumor Microenvironment/immunology , Tumor Microenvironment/genetics , NF-kappa B/metabolism , Tumor Biomarkers , Gene Expression Profiling , Computational Biology/methods , Signal Transduction , Cell Cycle Proteins