Results 1 - 12 of 12
1.
Sci Rep ; 12(1): 17580, 2022 Oct 20.
Article in English | MEDLINE | ID: mdl-36266530

ABSTRACT

Data analysis has increasingly relied on machine learning in recent years. Since machines implement mathematical algorithms without knowing the physical nature of the problem, they may be accurate but lack the flexibility to move across different domains. This manuscript presents a machine-educating approach where a machine is equipped with a physical model, universal building blocks, and an unlabeled dataset from which it derives its decision criteria. Here, the concept of machine education is deployed to identify thin layers of organic materials using hyperspectral imaging (HSI). The measured spectra formed a nonlinear mixture of the unknown background materials and the target material spectra. The machine was educated to resolve this nonlinear mixing and identify the spectral signature of the target materials. The inputs for educating and testing the machine were a nonlinear mixing model, the spectra of the pure target materials (which are problem invariant), and the unlabeled HSI data. The educated machine is accurate, and its generalization capabilities outperform classical machines. When using the educated machine, the number of falsely identified samples is ~100 times lower than with the classical machine. The probability of detection with the educated machine is 96% compared to 90% with the classical machine.


Subjects
Hyperspectral Imaging , Machine Learning , Algorithms , Support Vector Machine
2.
IEEE Trans Pattern Anal Mach Intell ; 44(3): 1320-1337, 2022 Mar.
Article in English | MEDLINE | ID: mdl-32813649

ABSTRACT

With increasing data volumes, the bottleneck in obtaining data for training a given learning task is the cost of manually labeling instances within the data. To alleviate this issue, various reduced label settings have been considered including semi-supervised learning, partial- or incomplete-label learning, multiple-instance learning, and active learning. Here, we focus on multiple-instance multiple-label learning with missing bag labels. Little research has been done for this challenging yet potentially powerful variant of incomplete supervision learning. We introduce a novel discriminative probabilistic model for missing labels in multiple-instance multiple-label learning. To address inference challenges, we introduce an efficient implementation of the EM algorithm for the model. Additionally, we consider an alternative inference approach that relies on maximizing the label-wise marginal likelihood of the proposed model instead of the joint likelihood. Numerical experiments on benchmark datasets illustrate the robustness of the proposed approach. In particular, comparison to state-of-the-art methods shows that our approach introduces a significantly smaller decrease in performance when the proportion of missing labels is increased.


Subjects
Algorithms , Supervised Machine Learning , Statistical Models
3.
IEEE Trans Med Imaging ; 39(10): 3125-3136, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32305904

ABSTRACT

Histopathological image analysis is a challenging task due to a diverse histology feature set as well as due to the presence of large non-informative regions in whole slide images. In this paper, we propose a multiple-instance learning (MIL) method for image-level classification as well as for annotating relevant regions in the image. In MIL, a common assumption is that negative bags contain only negative instances while positive bags contain one or more positive instances. This asymmetric assumption may be inappropriate for some application scenarios where negative bags also contain representative negative instances. We introduce a novel symmetric MIL framework associating each instance in a bag with an attribute which can be either negative, positive, or irrelevant. We extend the notion of relevance by introducing control over the number of relevant instances. We develop a probabilistic graphical model that incorporates the aforementioned paradigm and a corresponding computationally efficient inference for learning the model parameters and obtaining an instance level attribute-learning classifier. The effectiveness of the proposed method is evaluated on available histopathology datasets with promising results.
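The standard (asymmetric) MIL assumption mentioned in this abstract can be illustrated with a minimal Python sketch. The function name `bag_label_standard_mil` and the boolean instance labels are hypothetical choices made for illustration; the sketch covers only the classical assumption that the abstract contrasts against, not the authors' symmetric model.

```python
from typing import Sequence


def bag_label_standard_mil(instance_labels: Sequence[bool]) -> bool:
    """Standard MIL assumption: a bag is positive iff it holds at least one positive instance."""
    return any(instance_labels)


# A negative bag contains only negative instances.
print(bag_label_standard_mil([False, False, False]))  # False
# A single positive instance makes the whole bag positive.
print(bag_label_standard_mil([False, True, False]))   # True
```

Under this rule every instance in a negative bag is treated identically, which is the limitation the symmetric framework is described as addressing.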


Subjects
Computer-Assisted Image Processing , Statistical Models
4.
Forensic Sci Int ; 301: e55-e58, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31153677

ABSTRACT

Efficient and safe detection of Bacillus anthracis spores (BAS) is a challenging task, especially in bio-terror scenarios where the agent is concealed. We provide a proof-of-concept for the identification of concealed BAS inside mail envelopes using short-wave infrared hyperspectral imaging (SWIR-HSI). The spores and two other benign materials are identified according to their typical absorption spectrum. The identification process is based on the removal of the envelope signal using a new automatic algorithm. This method may serve as a fast screening tool prior to using classical bioanalytical techniques.


Subjects
Bacillus anthracis/isolation & purification , Infrared Rays , Spectrum Analysis/methods , Bacterial Spores/isolation & purification , Algorithms , Bioterrorism , Forensic Sciences/methods , Humans , Postal Service
5.
IEEE Trans Pattern Anal Mach Intell ; 39(12): 2381-2394, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28103189

ABSTRACT

Labeling data for classification requires significant human effort. To reduce labeling cost, instead of labeling every instance, a group of instances (bag) is labeled by a single bag label. Computer algorithms are then used to infer the label for each instance in a bag, a process referred to as instance annotation. This task is challenging due to the ambiguity regarding the instance labels. We propose a discriminative probabilistic model for the instance annotation problem and introduce an expectation maximization framework for inference, based on the maximum likelihood approach. For many probabilistic approaches, brute-force computation of the instance label posterior probability given its bag label is exponential in the number of instances in the bag. Our contribution is a dynamic programming method for computing the posterior that is linear in the number of instances. We evaluate our method using both benchmark and real world data sets, in the domain of bird song, image annotation, and activity recognition. In many cases, the proposed framework outperforms, sometimes significantly, the current state-of-the-art MIML learning methods, both in instance label prediction and bag label prediction.

6.
J Acoust Soc Am ; 131(6): 4640-50, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22712937

ABSTRACT

Although field-collected recordings typically contain multiple simultaneously vocalizing birds of different species, acoustic species classification in this setting has received little study so far. This work formulates the problem of classifying the set of species present in an audio recording using the multi-instance multi-label (MIML) framework for machine learning, and proposes a MIML bag generator for audio, i.e., an algorithm which transforms an input audio signal into a bag-of-instances representation suitable for use with MIML classifiers. The proposed representation uses a 2D time-frequency segmentation of the audio signal, which can separate bird sounds that overlap in time. Experiments using audio data containing 13 species collected with unattended omnidirectional microphones in the H. J. Andrews Experimental Forest demonstrate that the proposed methods achieve high accuracy (96.1% true positives/negatives). Automated detection of bird species occurrence using MIML has many potential applications, particularly in long-term monitoring of remote sites, species distribution modeling, and conservation planning.


Subjects
Acoustics , Birds/classification , Animal Vocalization/classification , Algorithms , Animals , Birds/physiology , Noise/prevention & control , Perceptual Masking/physiology , Reproducibility of Results , Sound Spectrography , Tape Recording , Animal Vocalization/physiology
7.
Cytometry B Clin Cytom ; 80(5): 282-90, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21462309

ABSTRACT

BACKGROUND: The role of flow cytometry (FCM) in diagnosing myelodysplastic syndromes (MDS) remains controversial, because analysis of myeloid maturation may involve subjective interpretation of sometimes subtle patterns on multiparameter FCM. METHODS: Using six-parameter marker combinations known to be useful in evaluating the myeloid compartment in MDS, we measured objective immunophenotypic differences between non-neoplastic (n = 25) and dysplastic (n = 17) granulopoiesis using a novel method, called Fisher information nonparametric embedding (FINE), that measures information distances among FCM datasets modeled as individual high-dimensional probability density functions, rather than as sets of two-dimensional histograms. Information-preserving component analysis (IPCA) was used to create information-optimized "rotated" two-dimensional histograms for visualizing myelopoietic immunophenotypes for each individual sample. RESULTS: There was a consistent trend of segregation of higher-grade MDS (RAEB and RCMD) from benign by FINE analysis. This difference was accentuated in cases with morphologic dysgranulopoiesis and in cases with clonal cytogenetic abnormalities. However, lower grades of MDS or cases that lacked morphologic dysgranulopoiesis showed much greater overlap with non-neoplastic cases. Two cases of reactive left shift were consistently embedded within the higher-grade MDS group. IPCA yielded two-dimensional histogram projections for each individual case by relative weighting of measured cellular characteristics, optimized for preserving information distances derived through FINE. 
CONCLUSIONS: Objective analysis by information geometry supports the conclusions of previous studies that there are immunophenotypic differences in the maturation patterns of benign granulopoiesis and high grade MDS, but also reinforces the known pitfalls of overlap between low-grade MDS and benign granulopoiesis and overlap between reactive granulocytic left shifts and dysplastic granulopoiesis.


Subjects
Flow Cytometry/methods , Immunophenotyping , Myelodysplastic Syndromes/diagnosis , Myelopoiesis , Bone Marrow Cells/pathology , Granulocytes/pathology , Hematopoiesis , Humans , Immunophenotyping/methods , Leukocyte Common Antigens/metabolism , Myelodysplastic Syndromes/pathology , Preleukemia/diagnosis
8.
IEEE Trans Pattern Anal Mach Intell ; 31(11): 2093-8, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19762935

ABSTRACT

We consider the problems of clustering, classification, and visualization of high-dimensional data when no straightforward Euclidean representation exists. In this paper, we propose using the properties of information geometry and statistical manifolds in order to define similarities between data sets using the Fisher information distance. We will show that this metric can be approximated using entirely nonparametric methods, as the parameterization and geometry of the manifold are generally unknown. Furthermore, by using multidimensional scaling methods, we are able to reconstruct the statistical manifold in a low-dimensional Euclidean space, enabling effective learning on the data. As a whole, we refer to our framework as Fisher Information Nonparametric Embedding (FINE) and illustrate its uses on practical problems, including a biomedical application and document classification.


Subjects
Algorithms , Artificial Intelligence , Cluster Analysis , Factual Databases , Information Storage and Retrieval/methods , Theoretical Models , Automated Pattern Recognition/methods , User-Computer Interface , Computer Simulation
9.
IEEE Trans Image Process ; 18(6): 1215-27, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19380271

ABSTRACT

The application that motivates this paper is molecular imaging at the atomic level. When discretized at subatomic distances, the volume is inherently sparse. Noiseless measurements from an imaging technology can be modeled by convolution of the image with the system point spread function (psf). Such is the case with magnetic resonance force microscopy (MRFM), an emerging technology where imaging of an individual tobacco mosaic virus was recently demonstrated with nanometer resolution. We also consider additive white Gaussian noise (AWGN) in the measurements. Many prior works of sparse estimators have focused on the case when H has low coherence; however, the system matrix H in our application is the convolution matrix for the system psf. A typical convolution matrix has high coherence. This paper, therefore, does not assume a low coherence H. A discrete-continuous form of the Laplacian and atom at zero (LAZE) p.d.f. used by Johnstone and Silverman is formulated, and two sparse estimators derived by maximizing the joint p.d.f. of the observation and image conditioned on the hyperparameters. A thresholding rule that generalizes the hard and soft thresholding rule appears in the course of the derivation. This so-called hybrid thresholding rule, when used in the iterative thresholding framework, gives rise to the hybrid estimator, a generalization of the lasso. Estimates of the hyperparameters for the lasso and hybrid estimator are obtained via Stein's unbiased risk estimate (SURE). A numerical study with a Gaussian psf and two sparse images shows that the hybrid estimator outperforms the lasso.
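The hard and soft thresholding rules that this abstract says are generalized by the hybrid rule can be written down in a few lines. The sketch below shows only the two textbook rules, with hypothetical function names; the hybrid rule itself is not reproduced because the abstract does not define it.

```python
def soft_threshold(x: float, t: float) -> float:
    """Shrink x toward zero by t; any value with |x| <= t becomes 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def hard_threshold(x: float, t: float) -> float:
    """Keep x unchanged if |x| > t, otherwise set it to 0."""
    return x if abs(x) > t else 0.0


values = [-3.0, -0.5, 0.0, 0.5, 3.0]
print([soft_threshold(v, 1.0) for v in values])  # [-2.0, 0.0, 0.0, 0.0, 2.0]
print([hard_threshold(v, 1.0) for v in values])  # [-3.0, 0.0, 0.0, 0.0, 3.0]
```

Soft thresholding is the elementwise operation used in iterative thresholding schemes for the lasso, which is why a rule generalizing it leads to a generalization of the lasso.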


Subjects
Algorithms , Computer-Assisted Image Processing/methods , Atomic Force Microscopy , Computer Simulation , Magnetic Resonance Spectroscopy , Statistical Models , Proteins/chemistry , Viruses/chemistry
10.
Cytometry B Clin Cytom ; 76(1): 1-7, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18642311

ABSTRACT

BACKGROUND: Clinical flow cytometry typically involves the sequential interpretation of two-dimensional histograms, usually culled from six or more cellular characteristics, following initial selection (gating) of cell populations based on a different subset of these characteristics. We examined the feasibility of instead treating gated n-parameter clinical flow cytometry data as objects embedded in n-dimensional space using principles of information geometry via a recently described method known as Fisher Information Non-parametric Embedding (FINE). METHODS: After initial selection of relevant cell populations through an iterative gating strategy, we converted four color (six-parameter) clinical flow cytometry datasets into six-dimensional probability density functions, and calculated differences among these distributions using the Kullback-Leibler divergence (a measurement of relative distributional entropy shown to be an appropriate approximation of Fisher information distance in certain types of statistical manifolds). Neighborhood maps based on Kullback-Leibler divergences were projected onto two dimensional displays for comparison. RESULTS: These methods resulted in the effective unsupervised clustering of cases of acute lymphoblastic leukemia from cases of expansion of physiologic B-cell precursors (hematogones) within a set of 54 patient samples. CONCLUSIONS: The treatment of flow cytometry datasets as objects embedded in high-dimensional space (as opposed to sequential two-dimensional analyses) harbors the potential for use as a decision-support tool in clinical practice or as a means for context-based archiving and searching of clinical flow cytometry data based on high-dimensional distribution patterns contained within stored list mode data. Additional studies will be needed to further test the effectiveness of this approach in clinical practice.
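The Kullback-Leibler divergence named in the METHODS can be illustrated on discrete (binned) distributions. This is a minimal sketch assuming each dataset has already been reduced to a normalized histogram over shared bins; the function name and the binning are assumptions, and the study itself works with six-dimensional density estimates rather than the toy vectors shown here.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions, in nats."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


sample_a = [0.5, 0.5]
sample_b = [0.25, 0.75]
print(kl_divergence(sample_a, sample_a))  # 0.0
print(round(kl_divergence(sample_a, sample_b), 4))  # 0.1438
```

Note that the divergence is not symmetric in its arguments, so a pairwise dissimilarity table built from it depends on the argument order unless it is symmetrized.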


Subjects
Flow Cytometry , Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology , B-Lymphoid Precursor Cells/metabolism , Adolescent , Adult , Aged , Algorithms , CD Antigens/metabolism , Child , Preschool Child , Cluster Analysis , Statistical Data Interpretation , Female , Humans , Immunophenotyping , Infant , Male , Middle Aged , Precursor Cell Lymphoblastic Leukemia-Lymphoma/immunology , B-Lymphoid Precursor Cells/immunology , Young Adult
11.
IEEE Stat Signal Processing Workshop ; 2009: 654-657, 2009 Sep 03.
Article in English | MEDLINE | ID: mdl-25905108

ABSTRACT

Divergence measures find application in many areas of statistics, signal processing and machine learning, creating a need for good estimators of divergence measures. While several estimators of divergence measures have been proposed in the literature, the performance of these estimators is not known. We propose a simple kNN density estimation based plug-in estimator for estimation of divergence measures. Based on the properties of kNN density estimates, we derive the bias, variance and mean square error of the estimator in terms of the sample size, the dimension of the samples and the underlying probability distribution. Based on these results, we specify the optimal choice of tuning parameters for minimum mean square error. We also present results on convergence in distribution of the proposed estimator. These results will establish a basis for analyzing the performance of image registration methods that maximize divergence.
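The idea of a kNN plug-in estimator can be sketched in one dimension. The abstract does not give the estimator's exact form, so the code below is a generic textbook version and not the authors' estimator: a kNN density estimate (k divided by sample count times the length of the interval reaching the k-th neighbor) is plugged into a sample average of a log density ratio. All function names, and the use of separate evaluation points, are assumptions.

```python
import math
from typing import Sequence


def knn_density_1d(x: float, samples: Sequence[float], k: int) -> float:
    """kNN density estimate at x: k / (n * length of the interval reaching the k-th neighbor)."""
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    distances = sorted(abs(x - s) for s in samples)
    radius = distances[k - 1]  # distance from x to its k-th nearest sample
    if radius == 0.0:
        raise ValueError("k-th neighbor coincides with x; increase k")
    return k / (len(samples) * 2.0 * radius)  # a 1-D ball of this radius has length 2 * radius


def plug_in_kl(points: Sequence[float], samples_f: Sequence[float],
               samples_g: Sequence[float], k: int) -> float:
    """Average of log(f_hat / g_hat) over evaluation points assumed to be drawn from f."""
    logs = [
        math.log(knn_density_1d(x, samples_f, k) / knn_density_1d(x, samples_g, k))
        for x in points
    ]
    return sum(logs) / len(logs)


print(knn_density_1d(1.5, [0.0, 1.0, 2.0, 3.0], k=2))  # 0.5
```

The choice of k is the tuning parameter in this sketch; it trades bias against variance in the density estimate.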

12.
J Opt Soc Am A Opt Image Sci Vis ; 23(8): 1835-45, 2006 Aug.
Article in English | MEDLINE | ID: mdl-16835639

ABSTRACT

The Schulz-Snyder iterative algorithm for phase retrieval attempts to recover a nonnegative function from its autocorrelation by minimizing the I-divergence between a measured autocorrelation and the autocorrelation of the estimated image. We illustrate that the Schulz-Snyder algorithm can become trapped in a local minimum of the I-divergence surface. To show that the estimates found are indeed local minima, sufficient conditions involving the gradient and the Hessian matrix of the I-divergence are given. Then we build a brief proof showing how an estimate that satisfies these conditions is a local minimum. The conditions are used to perform numerical tests determining local minimality of estimates. Along with the tests, related numerical issues are examined, and some interesting phenomena are discussed.
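For reference, the I-divergence (Csiszár's generalized Kullback-Leibler divergence for nonnegative functions) being minimized can be written as follows. The notation is assumed here and does not come from the abstract: $r$ is the measured autocorrelation, $\hat f$ the estimated nonnegative image, and $r_{\hat f}$ the autocorrelation of the estimate.

```latex
I(r \,\|\, r_{\hat f}) = \sum_{y} \left[ r(y) \ln \frac{r(y)}{r_{\hat f}(y)} - r(y) + r_{\hat f}(y) \right],
\qquad
r_{\hat f}(y) = \sum_{x} \hat f(x)\, \hat f(x + y).
```

Because $r_{\hat f}$ is quadratic in $\hat f$, this objective is generally not convex in $\hat f$, which is consistent with the local minima the abstract reports.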
