Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 5 de 5
Filter
Add more filters










Database
Language
Publication year range
1.
Nat Methods ; 19(12): 1590-1598, 2022 12.
Article in English | MEDLINE | ID: mdl-36357692

ABSTRACT

RNA modifications such as m6A methylation form an additional layer of complexity in the transcriptome. Nanopore direct RNA sequencing can capture this information in the raw current signal for each RNA molecule, enabling the detection of RNA modifications using supervised machine learning. However, experimental approaches provide only site-level training data, whereas the modification status for each single RNA molecule is missing. Here we present m6Anet, a neural-network-based method that leverages the multiple instance learning framework to specifically handle missing read-level modification labels in site-level training data. m6Anet outperforms existing computational methods, shows similar accuracy as experimental approaches, and generalizes with high accuracy to different cell lines and species without retraining model parameters. In addition, we demonstrate that m6Anet captures the underlying read-level stoichiometry, which can be used to approximate differences in modification rates. Overall, m6Anet offers a tool to capture the transcriptome-wide identification and quantification of m6A from a single run of direct RNA sequencing.


Subject(s)
Nanopore Sequencing , RNA , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods , Methylation , Transcriptome
2.
Sensors (Basel) ; 22(15)2022 Aug 03.
Article in English | MEDLINE | ID: mdl-35957370

ABSTRACT

Mild cognitive impairment (MCI) is an early stage of cognitive decline or memory loss, commonly found among the elderly. A phonemic verbal fluency (PVF) task is a standard cognitive test that participants are asked to produce words starting with given letters, such as "F" in English and "ก" /k/ in Thai. With state-of-the-art machine learning techniques, features extracted from the PVF data have been widely used to detect MCI. The PVF features, including acoustic features, semantic features, and word grouping, have been studied in many languages but not Thai. However, applying the same PVF feature extraction methods used in English to Thai yields unpleasant results due to different language characteristics. This study performs analytical feature extraction on Thai PVF data to classify MCI patients. In particular, we propose novel approaches to extract features based on phonemic clustering (ability to cluster words by phonemes) and switching (ability to shift between clusters) for the Thai PVF data. The comparison results of the three classifiers revealed that the support vector machine performed the best with an area under the receiver operating characteristic curve (AUC) of 0.733 (N = 100). Furthermore, our implemented guidelines extracted efficient features, which support the machine learning models regarding MCI detection on Thai PVF data.


Subject(s)
Cognitive Dysfunction , Language , Aged , Cognitive Dysfunction/diagnosis , Humans , Machine Learning , Neuropsychological Tests , Semantics
3.
Sensors (Basel) ; 22(4)2022 Feb 17.
Article in English | MEDLINE | ID: mdl-35214483

ABSTRACT

The Montreal cognitive assessment (MoCA), a widely accepted screening tool for identifying patients with mild cognitive impairment (MCI), includes a language fluency test of verbal functioning; its scores are based on the number of unique correct words produced by the test taker. However, it is possible that unique words may be counted differently for various languages. This study focuses on Thai as a language that differs from English in terms of word combinations. We applied various automatic speech recognition (ASR) techniques to develop an assisted scoring system for the MoCA language fluency test with Thai language support. This was a challenge because Thai is a low-resource language for which domain-specific data are not publicly available, especially speech data from patients with MCIs. Furthermore, the great variety of pronunciation, intonation, tone, and accent of the patients, all of which might differ from healthy controls, bring more complexity to the model. We propose a hybrid time delay neural network hidden Markov model (TDNN-HMM) architecture for acoustic model training to create our ASR system that is robust to environmental noise and to the variation of voice quality impacted by MCI. The LOTUS Thai speech corpus was incorporated into the training set to improve the model's generalization. A preprocessing algorithm was implemented to reduce the background noise and improve the overall data quality before feeding data into the TDNN-HMM system for automatic word detection and language fluency score calculation. The results show that the TDNN-HMM model in combination with data augmentation using lattice-free maximum mutual information (LF-MMI) objective function provides a word error rate (WER) of 30.77%. To our knowledge, this is the first study to develop an ASR with Thai language support to automate the scoring system of MoCA's language fluency assessment.


Subject(s)
Language , Speech Perception , Humans , Mental Status and Dementia Tests , Speech , Thailand
4.
Trends Genet ; 38(3): 246-257, 2022 03.
Article in English | MEDLINE | ID: mdl-34711425

ABSTRACT

Nanopore sequencing provides signal data corresponding to the nucleotide motifs sequenced. Through machine learning-based methods, these signals are translated into long-read sequences that overcome the read size limit of short-read sequencing. However, analyzing the raw nanopore signal data provides many more opportunities beyond just sequencing genomes and transcriptomes: algorithms that use machine learning approaches to extract biological information from these signals allow the detection of DNA and RNA modifications, the estimation of poly(A) tail length, and the prediction of RNA secondary structures. In this review, we discuss how developments in machine learning methodologies contributed to more accurate basecalling and lower error rates, and how these methods enable new biological discoveries. We argue that direct nanopore sequencing of DNA and RNA provides a new dimensionality for genomics experiments and highlight challenges and future directions for computational approaches to extract the additional information provided by nanopore signal data.


Subject(s)
Nanopore Sequencing , Nanopores , Algorithms , Genomics , High-Throughput Nucleotide Sequencing/methods , Machine Learning , Sequence Analysis, DNA/methods
5.
Nat Biotechnol ; 39(11): 1394-1402, 2021 11.
Article in English | MEDLINE | ID: mdl-34282325

ABSTRACT

RNA modifications, such as N6-methyladenosine (m6A), modulate functions of cellular RNA species. However, quantifying differences in RNA modifications has been challenging. Here we develop a computational method, xPore, to identify differential RNA modifications from nanopore direct RNA sequencing (RNA-seq) data. We evaluate our method on transcriptome-wide m6A profiling data, demonstrating that xPore identifies positions of m6A sites at single-base resolution, estimates the fraction of modified RNA species in the cell and quantifies the differential modification rate across conditions. We apply xPore to direct RNA-seq data from six cell lines and multiple myeloma patient samples without a matched control sample and find that many m6A sites are preserved across cell types, whereas a subset exhibit significant differences in their modification rates. Our results show that RNA modifications can be identified from direct RNA-seq data with high accuracy, enabling analysis of differential modifications and expression from a single high-throughput experiment.


Subject(s)
Nanopore Sequencing , Nanopores , High-Throughput Nucleotide Sequencing , Humans , RNA/genetics , RNA/metabolism , RNA Processing, Post-Transcriptional/genetics , Sequence Analysis, RNA/methods , Transcriptome/genetics
SELECTION OF CITATIONS
SEARCH DETAIL
...