Results 1 - 6 of 6
1.
Article in English | MEDLINE | ID: mdl-37027676

ABSTRACT

Long non-coding RNAs (lncRNAs) play a vital role in regulating gene expression and other biological processes. Distinguishing lncRNAs from protein-coding transcripts helps researchers investigate the mechanism of lncRNA formation and its downstream regulation in various diseases. Previous approaches to identifying lncRNAs include traditional bio-sequencing and machine learning methods. Because feature extraction based on biological characteristics is tedious and bio-sequencing inevitably introduces artifacts, these lncRNA detection methods are not always satisfactory. Hence, in this work, we present lncDLSM, a deep learning-based framework that differentiates lncRNAs from protein-coding transcripts without depending on prior biological knowledge. lncDLSM is a helpful tool for identifying lncRNAs compared with other biological feature-based machine learning methods, and it can be applied to other species through transfer learning with satisfactory results. Further experiments showed that different species display distinct boundaries among distributions, corresponding to the homology and the specificity among species, respectively. An online web server is provided to the community for easy use and efficient identification of lncRNAs, available at http://39.106.16.168/lncDLSM.

2.
Nucleic Acids Res ; 50(21): e121, 2022 11 28.
Article in English | MEDLINE | ID: mdl-36130281

ABSTRACT

Multimodal single-cell sequencing technologies provide unprecedented information on cellular heterogeneity from multiple layers of genomic readouts. However, joint analysis of two modalities without properly handling the noise often leads to overfitting of one modality by the other and to worse clustering results than vanilla single-modality analysis. How to efficiently use the extra information from single-cell multi-omics to delineate cell states and identify meaningful signals remains a significant computational challenge. In this work, we propose a deep learning framework, named SAILERX, for efficient, robust, and flexible analysis of multimodal single-cell data. SAILERX consists of a variational autoencoder with invariant representation learning to correct technical noise from the sequencing process, and a multimodal data alignment mechanism to integrate information from different modalities. Instead of performing hard alignment by projecting both modalities to a shared latent space, SAILERX encourages the local structures of the two modalities, measured by pairwise similarities, to be similar. This strategy is more robust against overfitting to noise, which facilitates downstream analyses such as clustering, imputation, and marker gene detection. Furthermore, the invariant representation learning component enables SAILERX to perform integrative analysis on both multi- and single-modal datasets, making it an applicable and scalable tool for more general scenarios.
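
The contrast between hard alignment and matching pairwise similarities can be illustrated with a small sketch. This is not the SAILERX implementation: the cosine similarity measure, the mean-squared penalty, the function names, and the toy embeddings are all assumptions chosen for illustration.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Return the n x n matrix of pairwise cosine similarities between rows."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in embeddings]
    n = len(embeddings)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            sim[i][j] = dot / (norms[i] * norms[j])
    return sim


def structure_alignment_loss(z_a, z_b):
    """Mean squared difference between the two pairwise-similarity matrices.

    z_a and z_b hold one embedding per cell for each modality. Their
    dimensions may differ; only the number of cells must match.
    """
    sim_a = cosine_similarity_matrix(z_a)
    sim_b = cosine_similarity_matrix(z_b)
    n = len(z_a)
    total = sum(
        (sim_a[i][j] - sim_b[i][j]) ** 2 for i in range(n) for j in range(n)
    )
    return total / (n * n)


# Three cells embedded in 2-D for one modality (hypothetical values).
z_rna = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Same geometry rotated by 90 degrees: coordinates differ, structure does not.
z_atac_rotated = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]
# Different geometry: the first two cells collapse onto each other.
z_atac_collapsed = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]

loss_rotated = structure_alignment_loss(z_rna, z_atac_rotated)
loss_collapsed = structure_alignment_loss(z_rna, z_atac_collapsed)
```

In this toy case the rotated embedding incurs essentially no penalty even though its coordinates differ from the first modality, while the collapsed embedding is penalized. A loss of this kind therefore constrains the neighborhood structure without forcing both modalities into the same coordinates.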


Subjects
Genomics, Multiomics, Cluster Analysis, Single-Cell Analysis
3.
Sensors (Basel) ; 22(3)2022 Feb 01.
Article in English | MEDLINE | ID: mdl-35161860

ABSTRACT

Epileptogenesis is the gradual, dynamic process that progressively leads to epilepsy, passing from the latent stage to the chronic stage. How abnormal discharges cause theta rhythm loss in the deep brain during epileptogenesis remains unclear. In this paper, loss of theta rhythm was estimated from time-frequency power using longitudinal electroencephalography (EEG) recorded by deep brain electrodes (e.g., intracortical microelectrodes such as stereo-EEG electrodes), with monitored epileptic spikes, in a rat from the first region in the hippocampal circuit. Deep-brain EEG was collected from the period between adjacent sporadic interictal spikes (lasting 3.56 s-35.38 s) to the recovery period without spikes by videos while the rats were performing exploration. We found that loss of theta rhythm was more severe during the period between adjacent interictal spikes than during the recovery period without spikes, and that, during epileptogenesis, more loss was observed at the acute stage than at the chronic stage. We concluded that the emergence of the interictal spike was the direct cause of loss of theta rhythm, and that the inhibitory effect of the interictal spike on ongoing theta rhythm was persistent as well as time dependent during epileptogenesis. With the help of intracortical microelectrodes, this study provides a temporary proof that interictal spikes produce ongoing theta rhythm loss, suggesting that interictal spikes could correlate with the epileptogenesis process, display a time-dependent feature, and might be a potential biomarker for evaluating deficits in theta-related memory in the brain.
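
As a rough illustration of estimating theta loss from spectral power, the sketch below computes the fraction of signal power inside a theta band for two EEG segments. It is not the authors' pipeline: the 4-12 Hz band limits, the plain discrete Fourier transform, the 200 Hz sampling rate, and the synthetic segments are all assumptions.

```python
import cmath
import math


def band_power_fraction(signal, sample_rate_hz, low_hz, high_hz):
    """Fraction of non-DC spectral power that falls within [low_hz, high_hz]."""
    n = len(signal)
    in_band = 0.0
    total = 0.0
    # Plain DFT over positive frequencies (the DC bin k = 0 is skipped).
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        power = abs(coeff) ** 2
        freq_hz = k * sample_rate_hz / n
        total += power
        if low_hz <= freq_hz <= high_hz:
            in_band += power
    return in_band / total


# Hypothetical 1 s segments sampled at 200 Hz: an 8 Hz theta component plus
# a 40 Hz component, with the theta amplitude reduced in the second segment.
SAMPLE_RATE_HZ = 200
times = [t / SAMPLE_RATE_HZ for t in range(SAMPLE_RATE_HZ)]
recovery = [
    math.sin(2 * math.pi * 8 * t) + 0.5 * math.sin(2 * math.pi * 40 * t)
    for t in times
]
between_spikes = [
    0.2 * math.sin(2 * math.pi * 8 * t) + 0.5 * math.sin(2 * math.pi * 40 * t)
    for t in times
]

theta_recovery = band_power_fraction(recovery, SAMPLE_RATE_HZ, 4.0, 12.0)
theta_between = band_power_fraction(between_spikes, SAMPLE_RATE_HZ, 4.0, 12.0)
```

Applying the same function to consecutive short windows would give a time-resolved view of theta power, with a drop in the fraction marking theta loss in that window.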


Subjects
Temporal Lobe Epilepsy, Theta Rhythm, Animals, Brain, Electrodes, Electroencephalography, Hippocampus, Rats
4.
Nucleic Acids Res ; 50(3): e14, 2022 02 22.
Article in English | MEDLINE | ID: mdl-34792173

ABSTRACT

For many RNA molecules, the secondary structure is essential for the correct function of the RNA. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models that minimize free energy, which impose strong prior assumptions and are slow to run. Here, we propose a deep learning-based method, called UFold, for RNA secondary structure prediction, trained directly on annotated data and base-pairing rules. UFold proposes a novel image-like representation of RNA sequences, which can be efficiently processed by Fully Convolutional Networks (FCNs). We benchmark the performance of UFold on both within- and cross-family RNA datasets. It significantly outperforms previous methods on within-family datasets, while achieving performance similar to that of traditional methods when trained and tested on distinct RNA families. UFold is also able to predict pseudoknots accurately. Its prediction is fast, with an inference time of about 160 ms per sequence up to 1500 bp in length. An online web server running UFold is available at https://ufold.ics.uci.edu. Code is available at https://github.com/uci-cbcl/UFold.
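
One way to picture an image-like representation of a sequence is as a stack of L x L binary channels, one per ordered pair of bases. The sketch below builds such a stack from the outer product of one-hot encodings. The abstract does not specify the encoding, so the 16-channel layout, the base ordering, and the function names here are assumptions rather than a description of UFold's code.

```python
BASES = "AUCG"


def one_hot(sequence):
    """Encode an RNA sequence as a list of 4-element one-hot vectors."""
    return [[1 if base == b else 0 for b in BASES] for base in sequence]


def pairwise_image(sequence):
    """Build a 16 x L x L tensor as nested lists.

    Channel 4 * a + b is 1 at position (i, j) when base i is BASES[a]
    and base j is BASES[b], and 0 otherwise.
    """
    enc = one_hot(sequence)
    length = len(sequence)
    channels = []
    for a in range(4):
        for b in range(4):
            channels.append(
                [
                    [enc[i][a] * enc[j][b] for j in range(length)]
                    for i in range(length)
                ]
            )
    return channels


image = pairwise_image("GAUC")
# Channel for the ordered pair (A, U): A is index 0 and U is index 1 in BASES.
au_channel = image[4 * 0 + 1]
```

Each position (i, j) of the stack describes a candidate base pair, which is the kind of 2-D input a fully convolutional network can map to an L x L matrix of pairing scores.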


Subjects
Deep Learning, RNA, Algorithms, Base Pairing, Humans, Nucleic Acid Conformation, RNA/chemistry, RNA/genetics
5.
J Healthc Eng ; 2021: 1834123, 2021.
Article in English | MEDLINE | ID: mdl-34745491

ABSTRACT

The bottleneck associated with validating the parameters of the entropy model has limited the application of this model to modern functional imaging technologies such as resting-state functional magnetic resonance imaging (rfMRI). In this study, an optimization algorithm that chooses the parameters of the multiscale entropy (MSE) model was developed, and the effectiveness of the optimized model for localizing the epileptogenic hemisphere was validated through the classification rate of a supervised machine learning method. The rfMRI data of 20 mesial temporal lobe epilepsy patients with positive indicators (the indicators of epileptogenic hemisphere in the clinic) in the hippocampal formation of either the left or right hemisphere (equally divided into two groups) on structural MRI were collected and preprocessed. Then, three parameters of the MSE model were statistically optimized using both the receiver operating characteristic (ROC) curve and the area under the ROC curve in the sensitivity analysis, and the intergroup significance of the optimized entropy values was used to confirm the biomarked brain areas sensitive to the epileptogenic hemisphere. Finally, the optimized entropy values of these biomarked brain areas were used as the input feature vectors for a support vector machine to classify the epileptogenic hemisphere, and the classification effectiveness was cross-validated. Nine biomarked brain areas were confirmed by the optimized entropy values, including the medial superior frontal gyrus and the superior parietal gyrus (p < .01). The mean classification accuracy was greater than 90%. It can be concluded that the combination of the optimized MSE model with the machine learning model can accurately confirm the epileptogenic hemisphere by rfMRI. With the powerful information interaction capabilities of 5G communication, the epilepsy side-fixing algorithm that requires computing power can be integrated into a cloud platform. The demand side only needs to upload patient data to the service platform to realize the preoperative assessment of epilepsy.
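
For readers unfamiliar with multiscale entropy, the following sketch shows the standard two-step computation: coarse-grain the series at each scale, then compute sample entropy. The abstract does not name the three optimized parameters; the sketch assumes they are the embedding dimension `m`, the tolerance fraction `r_fraction`, and the scale factor, and the example series are synthetic.

```python
import math
import random
import statistics


def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    n_windows = len(series) // scale
    return [
        sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n_windows)
    ]


def sample_entropy(series, m, tolerance):
    """Sample entropy -ln(A / B).

    B counts pairs of length-m templates whose Chebyshev distance is at
    most `tolerance`; A counts the same for length m + 1.
    """
    n = len(series)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(
                    abs(a - b) for a, b in zip(templates[i], templates[j])
                )
                if distance <= tolerance:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: the estimate is undefined
    return -math.log(a / b)


def multiscale_entropy(series, m, r_fraction, max_scale):
    """Sample entropy of the coarse-grained series at scales 1..max_scale.

    The tolerance is fixed as r_fraction times the standard deviation of
    the original series.
    """
    tolerance = r_fraction * statistics.pstdev(series)
    return [
        sample_entropy(coarse_grain(series, scale), m, tolerance)
        for scale in range(1, max_scale + 1)
    ]


# Synthetic examples: a perfectly periodic series and a seeded random one.
periodic = [0.0, 1.0] * 20
rng = random.Random(0)
noisy = [rng.random() for _ in range(200)]

entropy_periodic = sample_entropy(periodic, m=2, tolerance=0.1)
entropy_noisy = sample_entropy(noisy, m=2, tolerance=0.2 * statistics.pstdev(noisy))
mse_curve = multiscale_entropy(noisy, m=2, r_fraction=0.2, max_scale=3)
```

The periodic series is fully predictable and yields zero sample entropy, while the random series scores higher. In a study like the one described, entropy values of this kind, computed per brain region, would serve as the classifier's input features.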


Subjects
Temporal Lobe Epilepsy, Brain, Entropy, Temporal Lobe Epilepsy/diagnostic imaging, Functional Laterality, Humans, Machine Learning, Magnetic Resonance Imaging
6.
Bioinformatics ; 37(Suppl_1): i317-i326, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252968

ABSTRACT

MOTIVATION: Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) provides new opportunities to dissect epigenomic heterogeneity and elucidate transcriptional regulatory mechanisms. However, computational modeling of scATAC-seq data is challenging due to its high dimension, extreme sparsity, complex dependencies and high sensitivity to confounding factors from various sources. RESULTS: Here, we propose a new deep generative model framework, named SAILER, for analyzing scATAC-seq data. SAILER aims to learn a low-dimensional nonlinear latent representation of each cell that defines its intrinsic chromatin state, invariant to extrinsic confounding factors such as read depth and batch effects. SAILER adopts the conventional encoder-decoder framework to learn the latent representation but imposes additional constraints to ensure that the learned representations are independent of the confounding factors. Experimental results on both simulated and real scATAC-seq datasets demonstrate that SAILER learns better and more biologically meaningful representations of cells than other methods. Its noise-free cell embeddings bring significant benefits in downstream analyses: clustering and imputation based on SAILER result in 6.9% and 18.5% improvements over existing methods, respectively. Moreover, because no matrix factorization is involved, SAILER can easily scale to process millions of cells. We implemented SAILER as a software package, freely available to all for large-scale scATAC-seq data analysis. AVAILABILITY AND IMPLEMENTATION: The software is publicly available at https://github.com/uci-cbcl/SAILER. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin Immunoprecipitation Sequencing, Single-Cell Analysis, Epigenomics, RNA Sequence Analysis, Software, Transposases