Results 1 - 5 of 5
1.
PLoS Comput Biol ; 20(4): e1012006, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38578796

ABSTRACT

Single-cell RNA sequencing (scRNASeq) data plays a major role in advancing our understanding of developmental biology. An important current question is how to classify transcriptomic profiles obtained from scRNASeq experiments into the various cell types and identify the lineage relationship for individual cells. Because of the fast accumulation of datasets and the high dimensionality of the data, it has become challenging to explore and annotate single-cell transcriptomic profiles by hand. To overcome this challenge, automated classification methods are needed. Classical approaches rely on supervised training datasets. However, due to the difficulty of obtaining data annotated at single-cell resolution, we propose instead to take advantage of partial annotations. The partial label learning framework assumes that we can obtain a set of candidate labels containing the correct one for each data point, a simpler setting than requiring a fully supervised training dataset. We study and, where needed, extend state-of-the-art multi-class classification methods, such as SVM, kNN, prototype-based, logistic regression and ensemble methods, to the partial label learning framework. Moreover, we study the effect of incorporating the structure of the label set into the methods. We focus particularly on the hierarchical structure of the labels, as commonly observed in developmental processes. We show, on simulated and real datasets, that these extensions make it possible to learn from partially labeled data and to make predictions with high accuracy, particularly with a nonlinear prototype-based method. We demonstrate that our methods, when trained with partially annotated data, reach the same performance as when trained with fully supervised data. Finally, we study the level of uncertainty present in the partially annotated data, and derive some prescriptive results on the effect of this uncertainty on the accuracy of the partial label learning methods. Overall, our findings show how hierarchical and non-hierarchical partial label learning strategies can help solve the problem of automated classification of single-cell transcriptomic profiles. Interestingly, these methods rely on a much less stringent type of annotated dataset than fully supervised learning methods.
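
To make the partial label setting concrete, here is a minimal sketch of one simple way to extend kNN to candidate label sets. It is not taken from the article and may differ from the authors' formulation: it assumes each neighbor spreads one unit of vote evenly over its candidate labels. The function name `pl_knn_predict` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def pl_knn_predict(train_points, candidate_sets, query, k=3):
    """Predict one label for `query` from partially labeled training data.

    Assumption (not from the article): each of the k nearest neighbors
    splits a single vote evenly across its candidate labels.
    """
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = defaultdict(float)
    for i in order[:k]:
        for label in candidate_sets[i]:
            votes[label] += 1.0 / len(candidate_sets[i])
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(votes), key=lambda label: votes[label])


# Toy data: two clusters, each point annotated with a set of candidate labels.
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
candidate_sets = [{"A", "B"}, {"A"}, {"A", "C"},
                  {"B", "C"}, {"B"}, {"A", "B"}]

print(pl_knn_predict(train_points, candidate_sets, (0.05, 0.05)))
print(pl_knn_predict(train_points, candidate_sets, (1.0, 1.0)))
```

Even though only two of the six toy points carry a single label, the label shared by all neighbors in a cluster accumulates the most votes, which is the intuition behind learning from candidate sets.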


Subject(s)
Gene Expression Profiling , Supervised Machine Learning , Uncertainty , Logistic Models
2.
Sensors (Basel) ; 23(10)2023 May 16.
Article in English | MEDLINE | ID: mdl-37430701

ABSTRACT

The rise in the use of social media networks has increased the prevalence of cyberbullying, and time is paramount to reduce the negative effects that derive from those behaviours on any social media platform. This paper aims to study the early detection problem from a general perspective by carrying out experiments over two independent datasets (Instagram and Vine), exclusively using users' comments. We used textual information from comments over baseline early detection models (fixed, threshold, and dual models) to apply three different methods of improving early detection. First, we evaluated the performance of Doc2Vec features. Finally, we also presented multiple instance learning (MIL) on early detection models and we assessed its performance. We applied time-aware precision (TaP) as an early detection metric to assess the performance of the presented methods. We conclude that the inclusion of Doc2Vec features improves the performance of baseline early detection models by up to 79.6%. Moreover, multiple instance learning shows an important positive effect for the Vine dataset, where smaller post sizes and less use of the English language are present, with a further improvement of up to 13%, but no significant enhancement is shown for the Instagram dataset.
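
The abstract does not define the threshold model, so the following is only an illustrative sketch under one common reading: comments are read in order, and a session is flagged as soon as a classifier score over the comments seen so far reaches a threshold. The names `early_detect` and `toy_score`, the keyword list, and the threshold value are all hypothetical stand-ins, not the paper's features or classifier.

```python
def early_detect(comments, score_fn, threshold=0.5):
    """Read comments in order; stop as soon as the score reaches the threshold.

    Returns (flagged, comments_read). Reading fewer comments before a
    correct positive decision means earlier detection.
    """
    seen = []
    for comment in comments:
        seen.append(comment)
        if score_fn(seen) >= threshold:
            return True, len(seen)
    return False, len(seen)


def toy_score(seen):
    # Hypothetical scorer: fraction of comments so far containing a flagged word.
    flagged = {"loser", "stupid"}
    hits = sum(any(word in flagged for word in c.lower().split()) for c in seen)
    return hits / len(seen)


session = ["nice photo", "you are a loser", "so stupid"]
print(early_detect(session, toy_score))
```

In this sketch the decision is made after the second comment, so the third is never read; an early detection metric rewards that kind of early, correct stop.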

3.
Comput Methods Programs Biomed ; 197: 105730, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32987228

ABSTRACT

BACKGROUND AND OBJECTIVE: In medical imaging, population studies have to overcome the differences that exist between individuals to identify invariant image features that can be used for diagnosis purposes. In functional neuroimaging, an appealing solution to identify neural coding principles that hold at the population level is inter-subject pattern analysis, i.e. to learn a predictive model on data from multiple subjects and evaluate its generalization performance on new subjects. Although it has gained popularity in recent years, its widespread adoption is still hampered by the blatant lack of a formal definition in the literature. In this paper, we precisely introduce the first principled formalization of inter-subject pattern analysis targeted at multivariate group analysis of functional neuroimaging. METHODS: We propose to frame inter-subject pattern analysis as a multi-source transductive transfer question, thus grounding it within several well-defined machine learning settings and broadening the spectrum of usable algorithms. We describe two sets of inter-subject brain decoding experiments that use several open datasets: a magneto-encephalography study with 16 subjects and a functional magnetic resonance imaging paradigm with 100 subjects. We assess the relevance of our framework by performing model comparisons, where one brain decoding model exploits our formalization while others do not. RESULTS: The first set of experiments demonstrates the superiority of a brain decoder that uses subject-by-subject standardization compared to state-of-the-art models that use other standardization schemes, making the case for the interest of the transductive and the multi-source components of our formalization. The second set of experiments quantitatively shows that, even after such transformation, it is more difficult for a brain decoder to generalize to new participants than to new data from participants available in the training phase, thus highlighting the transfer gap that needs to be overcome. CONCLUSION: This paper describes the first formalization of inter-subject pattern analysis as a multi-source transductive transfer learning problem. We demonstrate the added value of this formalization using proof-of-concept experiments on several complementary functional neuroimaging datasets. This work should help popularize inter-subject pattern analysis for functional neuroimaging population studies and pave the way for future methodological innovations.
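
As an illustration of subject-by-subject standardization, the sketch below z-scores every feature using only the samples of the same subject. It assumes a plain per-feature z-score with the population standard deviation; the article's exact procedure may differ, and the function name `standardize_by_subject` and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def standardize_by_subject(samples, subjects):
    """Z-score each feature within each subject.

    samples:  list of feature vectors (lists of floats)
    subjects: list of subject identifiers, one per sample
    """
    out = [None] * len(samples)
    for subject in set(subjects):
        idx = [i for i, s in enumerate(subjects) if s == subject]
        n_features = len(samples[idx[0]])
        mus = [mean([samples[i][j] for i in idx]) for j in range(n_features)]
        # Fall back to 1.0 when a feature is constant within a subject.
        sds = [pstdev([samples[i][j] for i in idx]) or 1.0
               for j in range(n_features)]
        for i in idx:
            out[i] = [(samples[i][j] - mus[j]) / sds[j]
                      for j in range(n_features)]
    return out


# Two subjects whose raw signals live on very different scales.
samples = [[1.0, 10.0], [3.0, 10.0], [100.0, 5.0], [300.0, 7.0]]
subjects = ["s1", "s1", "s2", "s2"]
print(standardize_by_subject(samples, subjects))
```

Because the statistics are computed per subject and need no labels, the same step can be applied to the unlabeled data of a new subject, which is consistent with the transductive, multi-source framing described in the abstract.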


Subject(s)
Brain Mapping , Functional Neuroimaging , Brain/diagnostic imaging , Humans , Magnetic Resonance Imaging , Multivariate Analysis , Neuroimaging
4.
BMC Bioinformatics ; 16: 138, 2015 Apr 30.
Article in English | MEDLINE | ID: mdl-25925131

ABSTRACT

BACKGROUND: This article provides an overview of the first BIOASQ challenge, a competition on large-scale biomedical semantic indexing and question answering (QA), which took place between March and September 2013. BIOASQ assesses the ability of systems to semantically index very large numbers of biomedical scientific articles, and to return concise and user-understandable answers to given natural language questions by combining information from biomedical articles and ontologies. RESULTS: The 2013 BIOASQ competition comprised two tasks, Task 1a and Task 1b. In Task 1a participants were asked to automatically annotate new PUBMED documents with MESH headings. Twelve teams participated in Task 1a, with a total of 46 system runs submitted, and one of the teams performing consistently better than the MTI indexer used by NLM to suggest MESH headings to curators. Task 1b used benchmark datasets containing 29 development and 282 test English questions, along with gold standard (reference) answers, prepared by a team of biomedical experts from around Europe; participants had to automatically produce answers. Three teams participated in Task 1b, with 11 system runs. The BIOASQ infrastructure, including benchmark datasets, evaluation mechanisms, and the results of the participants and baseline methods, is publicly available. CONCLUSIONS: A publicly available evaluation infrastructure for biomedical semantic indexing and QA has been developed, which includes benchmark datasets, and can be used to evaluate systems that: assign MESH headings to published articles or to English questions; retrieve relevant RDF triples from ontologies, relevant articles and snippets from PUBMED Central; produce "exact" and paragraph-sized "ideal" answers (summaries). The results of the systems that participated in the 2013 BIOASQ competition are promising. In Task 1a one of the systems performed consistently better than the NLM's MTI indexer. In Task 1b the systems received high scores in the manual evaluation of the "ideal" answers; hence, they produced high quality summaries as answers. Overall, BIOASQ helped obtain a unified view of how techniques from text classification, semantic indexing, document and passage retrieval, question answering, and text summarization can be combined to allow biomedical experts to obtain concise, user-understandable answers to questions reflecting their real information needs.


Subject(s)
Abstracting and Indexing/methods , Algorithms , Medical Subject Headings , Natural Language Processing , PubMed , Semantics , Software , Humans , National Library of Medicine (U.S.) , United States
5.
IEEE Trans Pattern Anal Mach Intell ; 29(2): 205-17, 2007 Feb.
Article in English | MEDLINE | ID: mdl-17170475

ABSTRACT

We investigate a new approach for online handwritten shape recognition. Interesting features of this approach include learning without manual tuning, learning from very few training samples, incremental learning of characters, and adaptation to user-specific needs. The proposed system can deal with two-dimensional graphical shapes such as Latin and Asian characters, command gestures, symbols, small drawings, and geometric shapes. It can be used as a building block for a series of recognition tasks with many applications.


Subject(s)
Algorithms , Artificial Intelligence , Electronic Data Processing/methods , Handwriting , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Humans , Image Enhancement/methods , Markov Chains , Models, Statistical , Online Systems