1.
Article in English | MEDLINE | ID: mdl-35675245

ABSTRACT

The goal of quantification learning is to induce models capable of accurately predicting the class distribution of new bags of unseen examples. These models return only the prevalence of each class in the bag, because predictions for individual examples are irrelevant in these tasks. A prototypical application of ordinal quantification is predicting the proportion of opinions that fall into each category from one to five stars. Ordinal quantification has hardly been studied in the literature; in fact, only one approach has been proposed so far. This article presents a comprehensive study of ordinal quantification, analyzing the applicability of the most important algorithms devised for multiclass quantification and proposing three new methods based on matching distributions using Earth mover's distance (EMD). Empirical experiments compare 14 algorithms on synthetic and benchmark data. To statistically analyze the results, we further introduce an EMD-based scoring function. The main conclusion is that methods using a criterion related to EMD, including two of our proposals, obtain significantly better results.
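
To make the EMD criterion mentioned in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the ordinal classes are equally spaced (for example, one to five stars), in which case the EMD between two prevalence vectors reduces to the sum of absolute differences between their cumulative distributions. The function name `ordinal_emd` is hypothetical.

```python
from itertools import accumulate


def ordinal_emd(p, q):
    """EMD between two prevalence vectors over the same ordered classes.

    Assumption: adjacent classes are one unit apart, so the EMD equals
    the sum of absolute differences between the cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("both distributions must cover the same classes")
    cum_p = list(accumulate(p))
    cum_q = list(accumulate(q))
    # The last cumulative value is the total mass of each vector (1 for
    # proper prevalences), so it contributes nothing and is skipped.
    return sum(abs(a - b) for a, b in zip(cum_p[:-1], cum_q[:-1]))


# All mass on one star versus all mass on five stars: the mass has to
# travel across four class boundaries.
worst_case = ordinal_emd([1, 0, 0, 0, 0], [0, 0, 0, 0, 1])
```

Unlike a measure that treats classes as unordered, this score penalizes a predicted distribution more the further its mass sits from the true classes along the ordinal scale.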

2.
IEEE Trans Neural Netw Learn Syst ; 24(11): 1901-5, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24808621

ABSTRACT

In many applications, the mistakes made by an automatic classifier are not equally serious: different errors carry different costs. Such problems can be addressed with a cost-sensitive learning approach. The main idea is to minimize not the number of errors but the total cost those errors incur. This brief presents a new multiclass cost-sensitive algorithm in which each example carries its own misclassification cost. Our proposal is theoretically well-founded and is designed to optimize cost-sensitive loss functions. This research was motivated by a real-world problem, the biomass estimation of several plankton taxonomic groups. In this application, our method outperforms traditional multiclass classification approaches that optimize accuracy.
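
The difference between counting errors and summing their costs can be illustrated with a short sketch. This is not the algorithm proposed in the brief; it only shows the evaluation criterion, under the assumption that each example comes with a row of costs, one per class that could be predicted. The names `error_count`, `total_cost`, and the toy data are hypothetical.

```python
def error_count(predictions, labels):
    """Number of misclassified examples (what accuracy-based methods target)."""
    return sum(pred != label for pred, label in zip(predictions, labels))


def total_cost(predictions, cost_rows):
    """Sum of example-dependent costs.

    Assumption: cost_rows[i][k] is the cost of predicting class k for
    example i, with cost 0 for the correct class.
    """
    return sum(row[pred] for pred, row in zip(predictions, cost_rows))


# Toy data: three examples, three classes.
labels = [0, 1, 2]
cost_rows = [
    [0, 1, 5],
    [2, 0, 2],
    [9, 1, 0],
]

preds_a = [1, 1, 2]  # one cheap mistake on the first example
preds_b = [0, 1, 0]  # one expensive mistake on the third example
```

Both prediction vectors make exactly one error, so an accuracy criterion cannot tell them apart, while the total cost separates them clearly.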


Subject(s)
Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Plankton/classification; Plankton/cytology; Support Vector Machine; Algorithms; Microscopy/methods; Plankton/isolation & purification; Reproducibility of Results; Sensitivity and Specificity
3.
J Comput Biol ; 17(12): 1711-23, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21128857

ABSTRACT

The functional characterization of genes involved in many complex traits (phenotypes) of plants, animals, or humans can be studied from a computational point of view using different tools. We propose prediction, from the machine learning point of view, as a way to search for the genetic basis of these traits. Predicting the exact value of a phenotype can be too difficult to yield a reliable model, whereas predicting an approximation, in the form of an interval of values, can be easier. We show that trustworthy and useful models can be obtained from this relaxed formulation. These predictors may be built as extensions of conventional classifiers or regressors. Although the prediction performance is similar in both cases, we show that the classification approach makes it straightforward to obtain a principled and scalable method for selecting a reduced set of features in these genetic learning tasks. We conclude by comparing the results achieved on a real-world data set of barley plants with those obtained with state-of-the-art methods used in the biological literature.
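
One way to picture the relaxed formulation described in this abstract is to replace each continuous phenotype value with the interval that contains it, so that a conventional classifier can be trained on interval labels. The sketch below is an illustrative assumption, not the authors' method: the cut points in `edges` and the function names are hypothetical.

```python
from bisect import bisect_right


def to_interval_label(value, edges):
    """Index of the interval containing value.

    Assumption: edges is a sorted list of cut points; k cut points
    define k + 1 intervals, each closed on the left.
    """
    return bisect_right(edges, value)


def interval_bounds(label, edges):
    """Lower and upper bound of the interval with the given index."""
    lower = edges[label - 1] if label > 0 else float("-inf")
    upper = edges[label] if label < len(edges) else float("inf")
    return lower, upper


# Hypothetical cut points for a continuous trait.
edges = [10.0, 20.0]
label = to_interval_label(15.0, edges)
bounds = interval_bounds(label, edges)
```

A classifier trained on such labels predicts a range of values rather than a single number, which is the kind of approximation the abstract argues is easier to learn.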


Subject(s)
Models, Genetic; Quantitative Trait, Heritable; Algorithms; Humans; Logistic Models; Phenotype; Quantitative Trait Loci/genetics; ROC Curve