Results 1 - 6 of 6
1.
Article in English | MEDLINE | ID: mdl-38324433

ABSTRACT

This article studies the generalization of neural networks (NNs) by examining how a network changes when trained on a sample with or without out-of-distribution (OoD) examples. If the network's predictions are less influenced by fitting the OoD examples, then the network learns attentively from the clean training set. A new notion, dataset-distraction stability, is proposed to measure this influence. Extensive CIFAR-10/100 experiments across VGG, ResNet, WideResNet, and ViT architectures and a range of optimizers show a negative correlation between dataset-distraction stability and generalizability. Using the distraction stability, we decompose the learning process on the training set S into multiple learning processes on subsets of S drawn from simpler distributions, i.e., distributions of smaller intrinsic dimension (ID), and derive a tighter generalization bound. Through attentive learning, the seemingly miraculous generalization of deep learning can be explained, and novel algorithms can be designed.
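
To make the notion concrete, here is a minimal sketch, in PyTorch, of how one might estimate a dataset-distraction-stability-style score for a classifier: train one copy of a network on the clean set and another on the clean set plus OoD examples, then compare their predictions on held-out clean data. The training loop, hyperparameters, and function names are our own illustration, not the paper's protocol.

```python
# Hypothetical distraction-stability probe (not the paper's code).
import copy
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, ConcatDataset

def train(model, dataset, epochs=10, lr=1e-2):
    loader = DataLoader(dataset, batch_size=128, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

def distraction_score(base_model, clean_set, ood_set, eval_set):
    """Average prediction disagreement induced by adding OoD examples."""
    m_clean = train(copy.deepcopy(base_model), clean_set)
    m_mixed = train(copy.deepcopy(base_model),
                    ConcatDataset([clean_set, ood_set]))
    loader = DataLoader(eval_set, batch_size=256)
    m_clean.eval(); m_mixed.eval()
    disagree, total = 0, 0
    with torch.no_grad():
        for x, _ in loader:
            p1 = m_clean(x).argmax(dim=1)
            p2 = m_mixed(x).argmax(dim=1)
            disagree += (p1 != p2).sum().item()
            total += x.size(0)
    return disagree / total  # lower score => more "attentive" learning
```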

2.
Article in English | MEDLINE | ID: mdl-37922185

ABSTRACT

This article shows that neural networks (NNs) with lower decision boundary (DB) variability have better generalizability. Two new notions, algorithm DB variability and (ε, η)-data DB variability, are proposed to measure DB variability from the algorithm and data perspectives. Extensive experiments show significant negative correlations between DB variability and generalizability. From the theoretical view, two lower bounds based on algorithm DB variability are proposed, neither of which explicitly depends on the sample size. We also prove an upper bound of order O(1/√m + ε + η log(1/η)) based on data DB variability. This bound is convenient to estimate, requires no labels, and does not explicitly depend on the network size, which is usually prohibitively large in deep learning.
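
As an illustration of the algorithm-side notion, the following hedged sketch trains the same architecture under several seeds and measures mean pairwise prediction disagreement on unlabeled probe points, matching the label-free flavor the abstract highlights. All names and defaults here are ours, not the authors'.

```python
# Hypothetical estimator of algorithm-level DB variability.
import itertools
import torch
from torch.utils.data import DataLoader

def db_variability(make_model, train_fn, train_set, probe_set, runs=5):
    """make_model: returns a fresh network; train_fn: trains it on train_set."""
    loader = DataLoader(probe_set, batch_size=256)
    preds = []
    for seed in range(runs):
        torch.manual_seed(seed)
        model = train_fn(make_model(), train_set)
        model.eval()
        with torch.no_grad():
            preds.append(torch.cat([model(x).argmax(1) for x, _ in loader]))
    # Mean pairwise disagreement across runs; no labels are needed.
    pairs = list(itertools.combinations(range(runs), 2))
    return sum((preds[i] != preds[j]).float().mean().item()
               for i, j in pairs) / len(pairs)
```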

3.
IEEE Trans Pattern Anal Mach Intell; 45(5): 5481-5496, 2023 May.
Article in English | MEDLINE | ID: mdl-36178992

ABSTRACT

Multimodal fusion and multitask learning are two vital topics in machine learning. Despite fruitful progress, existing methods for both problems remain brittle to the same challenge: it is difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. task). Besides, although the two problems are closely related, multimodal fusion and multitask learning have rarely been explored within the same methodological framework. In this paper, we propose the Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and, more importantly, applicable to multimodal and multitask dense image prediction. At its core, CEN adaptively exchanges channels between subnetworks of different modalities. Specifically, the channel-exchanging process is self-guided by individual channel importance, measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. For dense image prediction, the validity of CEN is tested in four scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to state-of-the-art methods. Detailed ablation studies demonstrate the advantage of each proposed component. Our code is available at https://github.com/yikaiw/CEN.
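
The core exchanging step can be pictured with a short sketch: channels whose BN scaling factor |γ| falls below a threshold are deemed unimportant and replaced by the other modality's channels. The actual implementation lives in the linked repository; the threshold and function below are illustrative assumptions only.

```python
# Illustrative channel exchange guided by BN scaling factors.
import torch
import torch.nn as nn

def exchange_channels(feat_a, feat_b, bn_a, bn_b, threshold=1e-2):
    """feat_*: (N, C, H, W) features; bn_*: the BN layers that produced them."""
    mask_a = bn_a.weight.abs() < threshold  # unimportant channels in branch A
    mask_b = bn_b.weight.abs() < threshold
    out_a, out_b = feat_a.clone(), feat_b.clone()
    out_a[:, mask_a] = feat_b[:, mask_a]    # A borrows B's channels
    out_b[:, mask_b] = feat_a[:, mask_b]    # B borrows A's channels
    return out_a, out_b

# Usage with two modality-specific branches sharing a channel layout
# (freshly initialized BN has gamma = 1, so nothing exchanges until
# training drives some scaling factors toward zero):
bn_rgb, bn_depth = nn.BatchNorm2d(64), nn.BatchNorm2d(64)
f_rgb, f_depth = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
f_rgb, f_depth = exchange_channels(bn_rgb(f_rgb), bn_depth(f_depth),
                                   bn_rgb, bn_depth)
```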

4.
Neural Comput; 33(8): 2163-2192, 2021 Jul 26.
Article in English | MEDLINE | ID: mdl-34310675

ABSTRACT

Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labeled data, whose instance-label pairs carry little knowledge. And when a deep network continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems, and it motivates us to design a similar mechanism, named artificial neural variability (ANV), that helps artificial neural networks inherit some advantages of "natural" neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. Empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible cost.


Subjects
Deep Learning; Brain; Humans; Neural Networks, Computer
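
A hedged sketch of how a training step in the spirit of entry 4's "neural variable" optimizers might look: perturb the weights with small Gaussian noise before the forward pass, compute gradients at the perturbed point, and restore the clean weights before updating. The noise scale and function name are illustrative, not taken from the paper.

```python
# Hypothetical NVRM-style training step with weight-noise variability.
import torch

def nvrm_step(model, loss_fn, batch, optimizer, sigma=0.01):
    x, y = batch
    noises = []
    with torch.no_grad():
        for p in model.parameters():          # inject variability
            eps = sigma * torch.randn_like(p)
            p.add_(eps)
            noises.append(eps)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)               # loss at perturbed weights
    loss.backward()
    with torch.no_grad():
        for p, eps in zip(model.parameters(), noises):
            p.sub_(eps)                       # restore clean weights
    optimizer.step()                          # update with the noisy gradient
    return loss.item()
```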
5.
Medicine (Baltimore); 99(35): e21902, 2020 Aug 28.
Article in English | MEDLINE | ID: mdl-32871922

ABSTRACT

The function of miR-9 in osteosarcoma is not well investigated and remains controversial. Therefore, we conducted a meta-analysis to explore the role of miR-9 in osteosarcoma and collected relevant TCGA data to further verify the result. In addition, bioinformatics analysis was conducted to investigate the mechanism and related pathways of miR-9-3p in osteosarcoma.

A literature search was performed on databases up to February 19, 2020, including PubMed, Web of Science, Science Direct, Cochrane Central Register of Controlled Trials, Wiley Online Library, China National Knowledge Infrastructure, China Biology Medicine disc, Chongqing VIP, and Wan Fang Data. The relation of miR-9 expression to survival outcome was estimated by hazard ratios (HRs) and 95% CIs. The meta-analysis was conducted in Stata 12.0 (Stata Corporation, TX). To further assess the function of miR-9 in osteosarcoma, relevant data from the TCGA database were collected. Three databases, miRDB, miRPathDB 2.0, and Targetscan 7.2, were used to predict target genes; genes present in all 3 databases were considered predicted target genes of miR-9-3p. Venny 2.1 was used for intersection analysis. Subsequently, GO, KEGG, and PPI network analyses were conducted on the overlapping target genes of miR-9-3p to explore the possible molecular mechanism in osteosarcoma.

The meta-analysis showed that overexpression of miR-9 was associated with worse overall survival (OS) (HR = 4.180, 95% CI: 2.880-6.066, P < .001, I² = 23.5%). Based on TCGA data, osteosarcoma patients with overexpression of miR-9-3p (HR = 1.603, 95% CI: 1.028-2.499, P = .037) and miR-9-5p (HR = 1.698, 95% CI: 1.133-2.545, P = .01) also suffered poor OS. In the bioinformatics analysis, 2 significant pathways were enriched: the Wnt signaling pathway from Gene Ontology analysis (GO:0016055, P-adjust = .008) and the Hippo signaling pathway from Kyoto Encyclopedia of Genes and Genomes analysis (P-adjust = .007). Moreover, the PPI network was visualized, revealing 117 nodes and 161 edges.

High miR-9 expression was associated with poor prognosis. Based on bioinformatics analysis, this study enhanced the understanding of the mechanism and related pathways of miR-9 in osteosarcoma.


Subjects
MicroRNAs/genetics; Osteosarcoma/genetics; Computational Biology; Databases, Genetic; Gene Expression Profiling; Humans; Prognosis; Signal Transduction/genetics
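
As a worked check of entry 5's pooled result, the snippet below recovers the log-scale standard error and z statistic implied by the reported HR and 95% CI; it uses only the numbers quoted in the abstract.

```python
# Sanity check of a pooled HR against its 95% CI on the log scale:
# the CI should be symmetric around ln(HR), and the z statistic
# follows from the implied standard error.
import math

hr, lo, hi = 4.180, 2.880, 6.066  # pooled OS result from the abstract
log_hr = (math.log(lo) + math.log(hi)) / 2        # midpoint on log scale
se = (math.log(hi) - math.log(lo)) / (2 * 1.96)   # SE of ln(HR)
z = log_hr / se

print(f"implied HR  = {math.exp(log_hr):.3f}")    # ~4.180, matches report
print(f"SE(ln HR)   = {se:.3f}")
print(f"z statistic = {z:.2f}")                   # ~7.5, hence P < .001
```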
6.
IEEE Trans Neural Netw Learn Syst; 31(12): 5349-5362, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32031953

ABSTRACT

Residual connections significantly boost the performance of deep neural networks. However, few theoretical results address the influence of residuals on the hypothesis complexity and the generalization ability of deep neural networks. This article studies the influence of residual connections on the hypothesis complexity of the neural network in terms of the covering number of its hypothesis space. We first present an upper bound on the covering number of networks with residual connections. This bound shares a similar structure with that of neural networks without residual connections, which suggests that moving a weight matrix or nonlinear activation from the "bone" to a "vine" would not enlarge the hypothesis space. Afterward, an O(1/√N) margin-based multiclass generalization bound is obtained for ResNet, as an exemplary case of deep neural networks with residual connections. Generalization guarantees for similar state-of-the-art architectures, such as DenseNet and ResNeXt, follow straightforwardly. According to the obtained generalization bound, regularization terms should be introduced in practice to keep the norms of the weight matrices from growing too large, ensuring good generalization ability; this justifies the technique of weight decay.


Subjects
Neural Networks, Computer; Algorithms; Deep Learning; Generalization, Psychological
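
Entry 6's conclusion, that the bound justifies weight decay, corresponds to standard practice. Here is a minimal sketch, assuming plain SGD (for which an explicit L2 penalty and the optimizer's built-in weight_decay coincide); the architecture and coefficient are illustrative, not taken from the article.

```python
# Two equivalent ways (under plain SGD) to keep weight-matrix norms
# small, which the covering-number bound above motivates.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# Option 1: built-in weight decay in the optimizer.
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=5e-4)

# Option 2: an explicit L2 penalty added to the loss, which makes the
# norm control visible in the objective itself.
def l2_penalty(model, lam=5e-4):
    return lam * sum(p.pow(2).sum() for p in model.parameters())
```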