Results 1 - 20 of 45
1.
IEEE Trans Image Process ; 33: 3353-3368, 2024.
Article in English | MEDLINE | ID: mdl-38787667

ABSTRACT

Continual zero-shot learning (CZSL) aims to develop a model that accumulates historical knowledge to recognize unseen tasks, while eliminating catastrophic forgetting for seen tasks when learning new tasks. However, existing CZSL methods, while mitigating catastrophic forgetting for old tasks, often cause negative transfer on new tasks by over-focusing on accumulating old knowledge and neglecting the plasticity of the model for learning new tasks. To tackle these problems, we propose PAMK, a prototype augmented multi-teacher knowledge transfer network that strikes a trade-off between recognition stability for old tasks and generalization plasticity for new tasks. PAMK consists of a prototype augmented contrastive generation (PACG) module and a multi-teacher knowledge transfer (MKT) module. To reduce the cumulative semantic decay of the class representation embedding and mitigate catastrophic forgetting, we propose a continual prototype augmentation strategy based on relevance scores in PACG. Furthermore, by introducing the prototype augmented semantic-visual contrastive loss, PACG promotes intra-class compactness for all classes across all tasks. MKT effectively accumulates semantic knowledge learned from old tasks to recognize new tasks via the proposed multi-teacher knowledge transfer, eliminating the negative transfer problem. Extensive experiments on various CZSL settings demonstrate the superior performance of PAMK compared to state-of-the-art methods. In particular, in the practical task-free CZSL setting, PAMK achieves gains of 3.28%, 3.09% and 3.71% in mean harmonic accuracy on the CUB, AWA1, and AWA2 datasets, respectively.
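
The "mean harmonic accuracy" reported above is not defined in the abstract. As a rough illustration only, the sketch below assumes the convention common in generalized zero-shot learning: the harmonic mean of seen-class and unseen-class accuracy, averaged over tasks. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def harmonic_accuracy(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean of seen-class and unseen-class accuracy (assumed convention)."""
    total = seen_acc + unseen_acc
    if total == 0:
        # Both accuracies are zero; define the harmonic mean as zero.
        return 0.0
    return 2.0 * seen_acc * unseen_acc / total


def mean_harmonic_accuracy(per_task: Sequence[Tuple[float, float]]) -> float:
    """Average the harmonic accuracy over (seen, unseen) pairs, one pair per task."""
    if not per_task:
        raise ValueError("per_task must contain at least one (seen, unseen) pair")
    return sum(harmonic_accuracy(s, u) for s, u in per_task) / len(per_task)
```

The harmonic mean is low whenever either accuracy is low, so a model cannot score well by favoring seen classes alone, which fits the stability-plasticity trade-off the abstract describes.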

2.
Article in English | MEDLINE | ID: mdl-37672372

ABSTRACT

Human-object interaction (HOI) detection involves identifying interactions represented as [Formula: see text] , requiring the localization of human-object pairs and interaction classification within an image. This work focuses on the challenge of detecting HOIs with unseen objects using the prevalent Transformer architecture. Our empirical analysis reveals that the performance degradation of novel HOI instances primarily arises from misclassifying unseen objects as confusable seen objects. To address this issue, we propose a similarity propagation (SP) scheme that leverages cosine similarity distance to regulate the prediction margin between seen and unseen objects. In addition, we introduce pseudo-supervision for unseen objects based on class semantic similarities during training. Furthermore, we incorporate semantic-aware instance-level and interaction-level contrastive losses with Transformer to enhance intraclass compactness and interclass separability, resulting in improved visual representations. Extensive experiments on two challenging benchmarks, V-COCO and HICO-DET, demonstrate the effectiveness of our model, outperforming current state-of-the-art methods under various zero-shot settings.

3.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 15137-15153, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37725728

ABSTRACT

Multi-view learning is a widely studied topic in machine learning, which considers learning with multiple views of samples to improve the prediction performance. Even though some approaches have sprung up recently, it is still challenging to jointly explore information contained in different views. Multi-view deep Gaussian processes have shown strong advantages in unsupervised representation learning. However, they are limited when dealing with labeled multi-view data for supervised learning, and ignore the application potential of uncertainty estimation. In this paper, we propose a supervised multi-view deep Gaussian process model (named SupMvDGP), which uses the labels of the views to further improve the performance, and offers quantitative uncertainty estimation as a supplement to help humans make better use of the predictions. According to the diversity of views, the SupMvDGP can establish an asymmetric depth structure to better model different views, so as to make full use of the properties of each view. We provide a variational inference method to effectively solve the complex model. Finally, we conduct comprehensive comparative experiments on multiple real-world datasets to evaluate the performance of SupMvDGP. The experimental results show that the SupMvDGP achieves state-of-the-art results in multiple tasks, which verifies the effectiveness and superiority of the proposed approach. Meanwhile, we provide a case study to show that the SupMvDGP provides better uncertainty estimation than alternative deep models, which can alert people to treat the prediction results more carefully in high-risk applications.

4.
Article in English | MEDLINE | ID: mdl-37585332

ABSTRACT

Multiview learning has made significant progress in recent years. However, an implicit assumption is that multiview data are complete, which is often contrary to practical applications. Due to human or data acquisition equipment errors, what we actually get is partial multiview data, which existing multiview algorithms are limited to processing. Modeling complex dependencies between views in terms of consistency and complementarity remains challenging, especially in partial multiview data scenarios. To address the above issues, this article proposes a deep Gaussian cross-view generation model (named PMvCG), which aims to model views according to the principles of consistency and complementarity and eventually learn the comprehensive representation of partial multiview data. PMvCG can discover cross-view associations by learning view-sharing and view-specific features of different views in the representation space. The missing views can be reconstructed and are applied in turn to further optimize the model. The estimated uncertainty in the model is also considered and integrated into the representation to improve the performance. We design a variational inference and iterative optimization algorithm to solve PMvCG effectively. We conduct comprehensive experiments on multiple real-world datasets to validate the performance of PMvCG. We compare the PMvCG with various methods by applying the learned representation to clustering and classification. We also provide more insightful analysis to explore the PMvCG, such as convergence analysis, parameter sensitivity analysis, and the effect of uncertainty in the representation. The experimental results indicate that PMvCG obtains promising results and surpasses other comparative methods under different experimental settings.

5.
Entropy (Basel) ; 25(4)2023 Mar 24.
Article in English | MEDLINE | ID: mdl-37190347

ABSTRACT

The Hamiltonian Monte Carlo (HMC) sampling algorithm exploits Hamiltonian dynamics to construct efficient Markov chain Monte Carlo (MCMC) samplers and has become increasingly popular in machine learning and statistics. Since HMC uses the gradient information of the target distribution, it can explore the state space much more efficiently than random-walk proposals, but may suffer from high autocorrelation. In this paper, we propose Langevin Hamiltonian Monte Carlo (LHMC) to reduce the autocorrelation of the samples. Probabilistic inference involving multi-modal distributions is very difficult for dynamics-based MCMC samplers, which are easily trapped in a mode far away from the other modes. To tackle this issue, we further propose a variational hybrid Monte Carlo (VHMC) method, which uses a variational distribution to explore the phase space and find new modes, and is capable of sampling from multi-modal distributions effectively. A formal proof is provided that shows that the proposed method can converge to target distributions. Both synthetic and real datasets are used to evaluate its properties and performance. The experimental results verify the theory and show superior performance in multi-modal sampling.
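
For readers unfamiliar with the baseline, the following is a minimal sketch of one step of standard HMC with a leapfrog integrator and a unit mass matrix. It is not the LHMC or VHMC method proposed in the paper; the function names, arguments, and the choice of NumPy are assumptions made for illustration.

```python
import numpy as np


def leapfrog(q, p, grad_log_prob, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator (unit mass)."""
    q = np.array(q, dtype=float)  # copy so the caller's arrays are not modified
    p = np.array(p, dtype=float)
    p += 0.5 * step_size * grad_log_prob(q)  # initial half step for momentum
    for _ in range(n_steps - 1):
        q += step_size * p                   # full step for position
        p += step_size * grad_log_prob(q)    # full step for momentum
    q += step_size * p                       # last full step for position
    p += 0.5 * step_size * grad_log_prob(q)  # final half step for momentum
    return q, p


def hmc_step(q, log_prob, grad_log_prob, step_size, n_steps, rng):
    """One standard HMC transition: resample momentum, integrate, accept or reject."""
    q = np.asarray(q, dtype=float)
    p0 = rng.standard_normal(q.shape)
    q_new, p_new = leapfrog(q, p0, grad_log_prob, step_size, n_steps)
    # Hamiltonian = potential energy (-log density) + kinetic energy.
    h_old = -log_prob(q) + 0.5 * np.dot(p0, p0)
    h_new = -log_prob(q_new) + 0.5 * np.dot(p_new, p_new)
    # Metropolis correction for the discretization error of the integrator.
    if np.log(rng.uniform()) < h_old - h_new:
        return q_new, True
    return q, False
```

The gradient drives each proposal far from the current state, which is the efficiency gain over random-walk proposals noted above. Because each trajectory follows the local geometry, a chain started in one mode rarely crosses a low-density region to reach another, which is the difficulty the abstract attributes to dynamics-based samplers.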

6.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 4981-4996, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35969573

ABSTRACT

Multivariate time series clustering has become an important research topic in the time series learning task, which aims to discover the correlation among multiple sequences and partition multivariate time series data into several subsets. Although there are currently some methods that can handle this task, most of them fail to discover informative subsequences from multivariate time series instances. In this paper, we first propose a novel unsupervised shapelet learning with adaptive neighbors (USLA) model for learning salient multivariate subsequences (i.e., multivariate shapelets), where the importance of each variate can be auto-determined when given a candidate multivariate shapelet. USLA performs multivariate shapelet-transformed representation learning and local structure learning simultaneously, but the performance of USLA with multivariate shapelets of different lengths is comparable to that of isometric multivariate shapelets. In fact, the shapelet-transformed representations learned from multivariate shapelets of different lengths can all represent multivariate time series instances separately and often contain complementary information to each other. Therefore, we develop a novel multiview USLA (MUSLA) model which treats shapelet-transformed representations learned from shapelets of different lengths as different views. In this way, MUSLA learns the importance of each view and the neighbor graph matrix among multiview representations when candidate multivariate shapelets of different lengths are determined. Experimental results show that MUSLA outperforms other state-of-the-art multivariate time series algorithms on real-world multivariate time series datasets.

7.
Article in English | MEDLINE | ID: mdl-36070264

ABSTRACT

Despite incomplete multiview clustering (IMC) being widely studied in the past decade, it is still difficult to model the correlation among multiple views due to the absence of partial views. Most existing works for IMC only mine the correlation among multiple views from available views and ignore the importance of missing views. To address this issue, we propose a novel Incomplete Multiview Nonnegative representation learning model with Graph completion and Adaptive neighbors (IMNGA), which performs common graph learning, missing graph completion, and consensus nonnegative representation learning simultaneously. In IMNGA, the common graph on all views and the incomplete graph of each view are used to reconstruct the completed graph of the corresponding view, where the common graph satisfies the neighbor constraints of incomplete multiview data and consensus representation. IMNGA gets consensus representation by factorizing completed and incomplete graphs, where consensus representation satisfies the common graph constraint. IMNGA shows its effectiveness by outperforming other state-of-the-art methods.

8.
Entropy (Basel) ; 24(3)2022 Mar 16.
Article in English | MEDLINE | ID: mdl-35327925

ABSTRACT

Recently, flow models parameterized by neural networks have been used to design efficient Markov chain Monte Carlo (MCMC) transition kernels. However, inefficient utilization of gradient information of the target distribution or the use of volume-preserving flows limits their performance in sampling from multi-modal target distributions. In this paper, we treat the training procedure of the parameterized transition kernels in a different manner and exploit a novel scheme to train MCMC transition kernels. We divide the training process of transition kernels into the exploration stage and training stage, which can make full use of the gradient information of the target distribution and the expressive power of deep neural networks. The transition kernels are constructed with non-volume-preserving flows and trained in an adversarial form. The proposed method achieves significant improvement in effective sample size and mixes quickly to the target distribution. Empirical results validate that the proposed method is able to achieve low autocorrelation of samples and fast convergence rates, and outperforms other state-of-the-art parameterized transition kernels in varieties of challenging analytically described distributions and real world datasets.

9.
IEEE Trans Neural Netw Learn Syst ; 33(3): 1242-1253, 2022 Mar.
Article in English | MEDLINE | ID: mdl-33326385

ABSTRACT

Recently, multiview learning has received increasing attention in machine learning. However, most existing multiview learning methods cannot directly deal with multiview sequential data, in which the inherent dynamical structure is often ignored. In particular, most traditional multiview machine learning methods assume that the items at different time slices within a sequence are independent of each other. In order to solve this problem, we propose a new multiview discriminant model based on conditional random fields (CRFs) to model multiview sequential data, called multiview CRF. It inherits the advantage of CRFs that they build a relationship between items in each sequence. Moreover, by introducing specific features designed on the CRFs for multiview data, the multiview CRF not only considers the relationship among different views but also captures the correlation between the features from the same view. In particular, some features can be reused or divided into different views to build a feature space of appropriate size. This helps to avoid underfitting caused by a feature space that is too small or overfitting caused by one that is too large. In order to handle large-scale data, we use the stochastic gradient method to speed up our model. The experimental results on the text and video data illustrate the superiority of the proposed model.

10.
IEEE Trans Cybern ; 52(11): 12414-12428, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34166216

ABSTRACT

Recently, the restricted Boltzmann machine (RBM) has aroused considerable interest in the multiview learning field. Although effectiveness is observed, like many existing multiview learning models, multiview RBM ignores the local manifold structure of multiview data. In this article, we first propose a novel graph RBM model, which preserves the data manifold structure and is amenable to Gibbs sampling. Then, we develop a multiview graph RBM model on the basis of the graph RBM, which performs local structural learning and multiview representation learning simultaneously. The proposed multiview model has the following merits: 1) it preserves the data manifold structure for multiview classification and 2) it performs view-consistent representation learning and view-specific representation learning simultaneously. The experimental results show that the proposed multiview model outperforms other state-of-the-art multiview classification algorithms.

11.
Entropy (Basel) ; 23(10)2021 Sep 30.
Article in English | MEDLINE | ID: mdl-34682013

ABSTRACT

The hidden Markov model (HMM) is a vital model for trajectory recognition. As the number of hidden states in an HMM is important and hard to determine, many nonparametric methods like hierarchical Dirichlet process HMMs and Beta process HMMs (BP-HMMs) have been proposed to determine it automatically. Among these methods, the sampled BP-HMM models the shared information among different classes, which has been proved to be effective in several trajectory recognition scenes. However, the existing BP-HMM maintains a state transition probability matrix for each trajectory, which is inconvenient for classification. Furthermore, the approximate inference of the BP-HMM is based on sampling methods, which usually take a long time to converge. To develop an efficient nonparametric sequential model that can capture cross-class shared information for trajectory recognition, we propose a novel variational BP-HMM model, in which the hidden states can be shared among different classes and each class chooses its own hidden states and maintains a unified transition probability matrix. In addition, we derive a variational inference method for the proposed model, which is more efficient than sampling-based methods. Experimental results on a synthetic dataset and two real-world datasets show that compared with the sampled BP-HMM and other related models, the variational BP-HMM has better performance in trajectory recognition.

12.
IEEE Trans Med Imaging ; 40(1): 395-404, 2021 01.
Article in English | MEDLINE | ID: mdl-32991280

ABSTRACT

The most frequent extracranial solid tumors of childhood, named peripheral neuroblastic tumors (pNTs), are very challenging to diagnose due to their diversified categories and varying forms. Auxiliary diagnosis methods for such pediatric malignant cancers are highly needed to assist pathologists and reduce the risk of misdiagnosis before treatment. In this paper, inspired by the particularity of microscopic pathology images, we integrate neural networks with the texture energy measure (TEM) and propose a novel network architecture named DetexNet (deep texture network). This method makes the low-level representation pattern clearer by embedding expert knowledge as a prior, so that the network can capture the key information of a relatively small pathological dataset more smoothly. By applying and finetuning TEM filters in the bottom layer of a network, we greatly improve the performance of the baseline. We further pre-train the model on unlabeled data with an auto-encoder architecture and implement a color space conversion on input images. Two kinds of experiments under different assumptions with limited training data are performed, and in both of them, the proposed method achieves the best performance compared with other state-of-the-art models and doctor diagnosis.


Subject(s)
Neoplasms , Neural Networks, Computer , Child , Humans , Neoplasms/diagnostic imaging
13.
IEEE Trans Pattern Anal Mach Intell ; 43(8): 2682-2696, 2021 Aug.
Article in English | MEDLINE | ID: mdl-32078533

ABSTRACT

Multi-label classification is an important research topic in machine learning, for which exploiting label dependencies is an effective modeling principle. Recently, probabilistic models have shown great potential in discovering dependencies among labels. In this paper, motivated by the recent success of multi-view learning to improve the generalization performance, we propose a novel multi-view probabilistic model named latent conditional Bernoulli mixture (LCBM) for multi-label classification. LCBM is a generative model taking features from different views as inputs, and conditional on the latent subspace shared by the views a Bernoulli mixture model is adopted to build label dependencies. Inside each component of the mixture, the labels have a weak correlation which facilitates computational convenience. The mean field variational inference framework is used to carry out approximate posterior inference in the probabilistic model, where we propose a Gaussian mixture variational autoencoder (GMVAE) for effective posterior approximation. We further develop a scalable stochastic training algorithm for efficiently optimizing the model parameters and variational parameters, and derive an efficient prediction procedure based on greedy search. Experimental results on multiple benchmark datasets show that our approach outperforms other state-of-the-art methods under various metrics.

14.
IEEE Trans Artif Intell ; 2(2): 146-168, 2021 Apr.
Article in English | MEDLINE | ID: mdl-35308425

ABSTRACT

Clustering is a machine learning paradigm of dividing sample subjects into a number of groups such that subjects in the same group are more similar to each other than to those in other groups. With advances in information acquisition technologies, samples can frequently be viewed from different angles or in different modalities, generating multi-view data. Multi-view clustering (MVC), which clusters subjects into subgroups using multi-view data, has attracted more and more attention. Although MVC methods have been developed rapidly, there have not been enough surveys to summarize and analyze the current progress. Therefore, we propose a novel taxonomy of the MVC approaches. Similar to other machine learning methods, we categorize them into generative and discriminative classes. In the discriminative class, based on the way of view integration, we split it further into five groups: Common Eigenvector Matrix, Common Coefficient Matrix, Common Indicator Matrix, Direct Combination and Combination After Projection. Furthermore, we relate MVC to other topics: multi-view representation, ensemble clustering, multi-task clustering, multi-view supervised and semi-supervised learning. Several representative real-world applications are elaborated for practitioners. Some benchmark multi-view datasets are introduced and representative MVC algorithms from each group are empirically evaluated to analyze how they perform on benchmark datasets. To promote future development of MVC approaches, we point out several open problems that may require further investigation and thorough examination.

15.
IEEE Trans Neural Netw Learn Syst ; 32(7): 2875-2885, 2021 Jul.
Article in English | MEDLINE | ID: mdl-32701454

ABSTRACT

Gaussian process (GP) models are flexible nonparametric models widely used in a variety of tasks. Variational sparse GP (VSGP) scales GP models to large data sets by summarizing the posterior process with a set of inducing points. In this article, we extend VSGP to handle multiview data. We model each view with a VSGP and augment it with an additional set of inducing points. These VSGPs are coupled together by enforcing the means of their posteriors to agree at the locations of these inducing points. To learn these shared inducing points, we introduce an additional GP model that is defined in the concatenated feature space. Experiments on real-world data sets show that our multiview VSGP (MVSGP) model outperforms single-view VSGP consistently and is superior to state-of-the-art kernel-based multiview baselines for classification tasks.

16.
IEEE Trans Pattern Anal Mach Intell ; 43(12): 4453-4468, 2021 Dec.
Article in English | MEDLINE | ID: mdl-32750782

ABSTRACT

Multi-view representation learning is a promising and challenging research topic, which aims to integrate multiple data information from different views to improve the learning performance. The recent deep Gaussian processes (DGPs) have the advantages of good uncertainty estimates, powerful non-linear mapping ability and great generalization capability, which can be used as an excellent data representation learning method. However, DGPs only focus on single view data and are rarely applied to the multi-view scenario. In this paper, we propose a multi-view representation learning algorithm with deep Gaussian processes (named MvDGPs), which inherits the advantages of deep Gaussian processes and multi-view representation learning, and can learn more effective representation of multi-view data. The MvDGPs consist of two stages. The first stage is multi-view data representation learning, which is mainly used to learn more comprehensive representations of multi-view data. The second stage is classifier design, which aims to select an appropriate classifier to better employ the representations obtained in the first stage. In contrast with DGPs, MvDGPs support asymmetrical modeling depths for different views of data, resulting in better characterizations of the discrepancies among different views. Experimental results on real-world multi-view data sets verify the effectiveness of the proposed algorithm, which indicates that MvDGPs can integrate the complementary information in multiple views to discover a good representation of the data.

17.
Ann Hepatol ; 19(5): 570-572, 2020.
Article in English | MEDLINE | ID: mdl-32546443

ABSTRACT

INTRODUCTION AND OBJECTIVES: The role of hepatologists in the management of hepatocellular carcinoma (HCC) is not well defined. We conducted a cross-sectional study to assess the feasibility of hepatology-directed HCC treatment. PATIENTS: We evaluated 107 patients with newly diagnosed HCC, undergoing locoregional therapy as the first therapy between January 2017 and February 2019. RESULTS: The hepatologist directly participated in most of the microwave ablations. This descriptive cross-sectional study reveals the feasibility of the hepatologist-directed thermal ablation therapy, with decent outcome including response rate. CONCLUSIONS: Hepatologists can play a key role in the management of HCC in the current era of multidisciplinary team approach. Training fellows in performing ultrasound guided thermal ablation techniques would be one step forward in this direction.


Subject(s)
Ablation Techniques , Carcinoma, Hepatocellular/surgery , Gastroenterologists , Liver Neoplasms/surgery , Microwaves/therapeutic use , Ablation Techniques/adverse effects , Adult , Aged , Aged, 80 and over , Carcinoma, Hepatocellular/diagnostic imaging , Carcinoma, Hepatocellular/physiopathology , Clinical Competence , Cross-Sectional Studies , Feasibility Studies , Female , Humans , Learning Curve , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/physiopathology , Male , Microwaves/adverse effects , Middle Aged , Treatment Outcome , Ultrasonography, Interventional
18.
IEEE Trans Neural Netw Learn Syst ; 31(9): 3442-3455, 2020 Sep.
Article in English | MEDLINE | ID: mdl-31670682

ABSTRACT

Canonical Correlation Analysis (CCA) is a popular multiview dimension reduction method, which aims to maximize the correlation between two views to find the common subspace shared by these two views. However, it can only deal with two-view data, while the number of views frequently exceeds two in many real applications. To handle data with more than two views, in the previous studies, either the pairwise correlation or the high-order correlation was employed. These two types of correlation define the relation of multiview data from different viewpoints, and both have special effects for view consistency. To obtain flexible view consistency, in this article, we propose multiview uncorrelated locality preserving projection (MULPP), which considers two types of correlation simultaneously. The MULPP also considers the complementary property of different views by preserving the local structures of all the views. To obtain multiple projections and minimize the redundancy of low-dimensional features, for each view, the MULPP makes the features extracted by different projections uncorrelated. The MULPP is solved by an iteration algorithm, and the convergence of the algorithm is proven. The experiments on Multiple Feature, Coil-100, 3Sources, and NUS-WIDE data sets demonstrate the effectiveness of MULPP.

19.
IEEE Trans Cybern ; 50(8): 3668-3681, 2020 Aug.
Article in English | MEDLINE | ID: mdl-31751262

ABSTRACT

Machine learning is developing rapidly; it has made many theoretical breakthroughs and is widely applied in various fields. Optimization, as an important part of machine learning, has attracted much attention from researchers. With the exponential growth in the amount of data and the increase in model complexity, optimization methods in machine learning face more and more challenges. A great deal of work on solving optimization problems or improving optimization methods in machine learning has been proposed. A systematic review and summary of optimization methods from the perspective of machine learning is of great significance and can offer guidance for the development of both optimization and machine learning research. In this article, we first describe the optimization problems in machine learning. Then, we introduce the principles and progress of commonly used optimization methods. Finally, we explore and present some challenges and open problems for optimization in machine learning.

20.
Ann Hepatol ; 18(1): 11-13, 2019.
Article in English | MEDLINE | ID: mdl-31113579

ABSTRACT

In North America, the role of Hepatologists in treatment of hepatocellular carcinoma is limited. We conducted a pilot project wherein a Hepatologist participated directly in microwave ablation of HCC at an academic center in the United States (n = 14). The pilot project shows promising outcomes, with complete remission rate of 93%.


Subject(s)
Carcinoma, Hepatocellular/therapy , Liver Neoplasms/therapy , Microwaves/therapeutic use , Neoplasm Staging , Radiofrequency Ablation/methods , Adult , Aged , Biopsy , Carcinoma, Hepatocellular/diagnosis , Carcinoma, Hepatocellular/epidemiology , Feasibility Studies , Female , Follow-Up Studies , Humans , Incidence , Liver Neoplasms/diagnosis , Liver Neoplasms/epidemiology , Magnetic Resonance Imaging , Male , Middle Aged , North America/epidemiology , Pilot Projects , Prospective Studies , Survival Rate/trends , Tomography, X-Ray Computed , Treatment Outcome