Results 1 - 6 of 6
1.
Article in English | MEDLINE | ID: mdl-38324433

ABSTRACT

This article studies the generalization of neural networks (NNs) by examining how a network changes when trained on a sample with or without out-of-distribution (OoD) examples. If the network's predictions are less influenced by fitting the OoD examples, then the network learns attentively from the clean training set. A new notion, dataset-distraction stability, is proposed to measure this influence. Extensive CIFAR-10/100 experiments across VGG, ResNet, WideResNet, and ViT architectures and a range of optimizers show a negative correlation between dataset-distraction stability and generalizability. Using the distraction stability, we decompose the learning process on the training set S into multiple learning processes on subsets of S drawn from simpler distributions, i.e., distributions of smaller intrinsic dimension (ID), and derive a tighter generalization bound. Through attentive learning, the seemingly miraculous generalization of deep learning can be explained, and novel algorithms can be designed.
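
To make the notion concrete, here is a minimal sketch, in PyTorch, of how one might estimate a dataset-distraction-stability-style score for a classifier: train one copy of a network on the clean set and another on the clean set plus OoD examples, then compare their predictions on held-out clean data. The training loop, hyperparameters, and function names are our own illustration, not the paper's protocol.

```python
# Hypothetical distraction-stability probe (not the paper's code).
import copy
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, ConcatDataset

def train(model, dataset, epochs=10, lr=1e-2):
    loader = DataLoader(dataset, batch_size=128, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

def distraction_score(base_model, clean_set, ood_set, eval_set):
    """Average prediction disagreement induced by adding OoD examples."""
    m_clean = train(copy.deepcopy(base_model), clean_set)
    m_mixed = train(copy.deepcopy(base_model),
                    ConcatDataset([clean_set, ood_set]))
    loader = DataLoader(eval_set, batch_size=256)
    m_clean.eval(); m_mixed.eval()
    disagree, total = 0, 0
    with torch.no_grad():
        for x, _ in loader:
            p1 = m_clean(x).argmax(dim=1)
            p2 = m_mixed(x).argmax(dim=1)
            disagree += (p1 != p2).sum().item()
            total += x.size(0)
    return disagree / total  # lower score => more "attentive" learning
```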

2.
Article in English | MEDLINE | ID: mdl-37922185

ABSTRACT

This article shows that neural networks (NNs) with lower decision boundary (DB) variability have better generalizability. Two new notions, algorithm DB variability and (ε, η)-data DB variability, are proposed to measure DB variability from the algorithm and data perspectives. Extensive experiments show significant negative correlations between DB variability and generalizability. From the theoretical view, two lower bounds based on algorithm DB variability are proposed, neither of which explicitly depends on the sample size. We also prove an upper bound of order O(1/√m + ε + η log(1/η)) based on data DB variability. This bound is convenient to estimate, requires no labels, and does not explicitly depend on the network size, which is usually prohibitively large in deep learning.
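
As an illustration of the algorithm-side notion, the following hedged sketch trains the same architecture under several seeds and measures mean pairwise prediction disagreement on unlabeled probe points, matching the label-free flavor the abstract highlights. All names and defaults here are ours, not the authors'.

```python
# Hypothetical estimator of algorithm-level DB variability.
import itertools
import torch
from torch.utils.data import DataLoader

def db_variability(make_model, train_fn, train_set, probe_set, runs=5):
    """make_model: returns a fresh network; train_fn: trains it on train_set."""
    loader = DataLoader(probe_set, batch_size=256)
    preds = []
    for seed in range(runs):
        torch.manual_seed(seed)
        model = train_fn(make_model(), train_set)
        model.eval()
        with torch.no_grad():
            preds.append(torch.cat([model(x).argmax(1) for x, _ in loader]))
    # Mean pairwise disagreement across runs; no labels are needed.
    pairs = list(itertools.combinations(range(runs), 2))
    return sum((preds[i] != preds[j]).float().mean().item()
               for i, j in pairs) / len(pairs)
```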

3.
IEEE Trans Pattern Anal Mach Intell; 45(5): 5481-5496, 2023 May.
Article in English | MEDLINE | ID: mdl-36178992

ABSTRACT

Multimodal fusion and multitask learning are two vital topics in machine learning. Despite fruitful progress, existing methods for both problems remain brittle to the same challenge: it is difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. task). Besides, although the two problems are closely related, multimodal fusion and multitask learning have rarely been explored within the same methodological framework. In this paper, we propose the Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and, more importantly, applicable to multimodal and multitask dense image prediction. At its core, CEN adaptively exchanges channels between subnetworks of different modalities. Specifically, the channel-exchanging process is self-guided by individual channel importance, measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. For dense image prediction, the validity of CEN is tested in four scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to state-of-the-art methods. Detailed ablation studies demonstrate the advantage of each proposed component. Our code is available at https://github.com/yikaiw/CEN.
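
The core exchanging step can be pictured with a short sketch: channels whose BN scaling factor |γ| falls below a threshold are deemed unimportant and replaced by the other modality's channels. The actual implementation lives in the linked repository; the threshold and function below are illustrative assumptions only.

```python
# Illustrative channel exchange guided by BN scaling factors.
import torch
import torch.nn as nn

def exchange_channels(feat_a, feat_b, bn_a, bn_b, threshold=1e-2):
    """feat_*: (N, C, H, W) features; bn_*: the BN layers that produced them."""
    mask_a = bn_a.weight.abs() < threshold  # unimportant channels in branch A
    mask_b = bn_b.weight.abs() < threshold
    out_a, out_b = feat_a.clone(), feat_b.clone()
    out_a[:, mask_a] = feat_b[:, mask_a]    # A borrows B's channels
    out_b[:, mask_b] = feat_a[:, mask_b]    # B borrows A's channels
    return out_a, out_b

# Usage with two modality-specific branches sharing a channel layout
# (freshly initialized BN has gamma = 1, so nothing exchanges until
# training drives some scaling factors toward zero):
bn_rgb, bn_depth = nn.BatchNorm2d(64), nn.BatchNorm2d(64)
f_rgb, f_depth = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
f_rgb, f_depth = exchange_channels(bn_rgb(f_rgb), bn_depth(f_depth),
                                   bn_rgb, bn_depth)
```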

4.
Neural Comput; 33(8): 2163-2192, 2021 Jul 26.
Article in English | MEDLINE | ID: mdl-34310675

ABSTRACT

Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labeled data, whose instance-label pairs carry little knowledge. And when a deep network continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems, and it motivates us to design a similar mechanism, named artificial neural variability (ANV), that helps artificial neural networks inherit some advantages of "natural" neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. Empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible cost.


Subjects
Deep Learning; Brain; Humans; Neural Networks, Computer
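
A hedged sketch of how a training step in the spirit of entry 4's "neural variable" optimizers might look: perturb the weights with small Gaussian noise before the forward pass, compute gradients at the perturbed point, and restore the clean weights before updating. The noise scale and function name are illustrative, not taken from the paper.

```python
# Hypothetical NVRM-style training step with weight-noise variability.
import torch

def nvrm_step(model, loss_fn, batch, optimizer, sigma=0.01):
    x, y = batch
    noises = []
    with torch.no_grad():
        for p in model.parameters():          # inject variability
            eps = sigma * torch.randn_like(p)
            p.add_(eps)
            noises.append(eps)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)               # loss at perturbed weights
    loss.backward()
    with torch.no_grad():
        for p, eps in zip(model.parameters(), noises):
            p.sub_(eps)                       # restore clean weights
    optimizer.step()                          # update with the noisy gradient
    return loss.item()
```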
5.
Medicine (Baltimore); 99(35): e21902, 2020 Aug 28.
Article in English | MEDLINE | ID: mdl-32871922

ABSTRACT

The function of miR-9 in osteosarcoma is not well investigated and remains controversial. Therefore, we conducted a meta-analysis to explore the role of miR-9 in osteosarcoma and collected relevant TCGA data to further verify the result. In addition, bioinformatics analysis was conducted to investigate the mechanism and related pathways of miR-9-3p in osteosarcoma.

A literature search was performed on databases up to February 19, 2020, including PubMed, Web of Science, Science Direct, Cochrane Central Register of Controlled Trials, Wiley Online Library, China National Knowledge Infrastructure, China Biology Medicine disc, Chongqing VIP, and Wan Fang Data. The relation of miR-9 expression to survival outcome was estimated by hazard ratios (HRs) and 95% CIs. The meta-analysis was conducted in Stata 12.0 (Stata Corporation, TX). To further assess the function of miR-9 in osteosarcoma, relevant data from the TCGA database were collected. Three databases, miRDB, miRPathDB 2.0, and Targetscan 7.2, were used to predict target genes; genes present in all 3 databases were considered predicted target genes of miR-9-3p. Venny 2.1 was used for intersection analysis. Subsequently, GO, KEGG, and PPI network analyses were conducted on the overlapping target genes of miR-9-3p to explore the possible molecular mechanism in osteosarcoma.

The meta-analysis showed that overexpression of miR-9 was associated with worse overall survival (OS) (HR = 4.180, 95% CI: 2.880-6.066, P < .001, I² = 23.5%). Based on TCGA data, osteosarcoma patients with overexpression of miR-9-3p (HR = 1.603, 95% CI: 1.028-2.499, P = .037) and miR-9-5p (HR = 1.698, 95% CI: 1.133-2.545, P = .01) also suffered poor OS. In the bioinformatics analysis, 2 significant pathways were enriched: the Wnt signaling pathway from Gene Ontology analysis (GO:0016055, P-adjust = .008) and the Hippo signaling pathway from Kyoto Encyclopedia of Genes and Genomes analysis (P-adjust = .007). Moreover, the PPI network was visualized, revealing 117 nodes and 161 edges.

High miR-9 expression was associated with poor prognosis. Based on bioinformatics analysis, this study enhanced the understanding of the mechanism and related pathways of miR-9 in osteosarcoma.


Subjects
MicroRNAs/genetics; Osteosarcoma/genetics; Computational Biology; Databases, Genetic; Gene Expression Profiling; Humans; Prognosis; Signal Transduction/genetics
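
As a worked check of entry 5's pooled result, the snippet below recovers the log-scale standard error and z statistic implied by the reported HR and 95% CI; it uses only the numbers quoted in the abstract.

```python
# Sanity check of a pooled HR against its 95% CI on the log scale:
# the CI should be symmetric around ln(HR), and the z statistic
# follows from the implied standard error.
import math

hr, lo, hi = 4.180, 2.880, 6.066  # pooled OS result from the abstract
log_hr = (math.log(lo) + math.log(hi)) / 2        # midpoint on log scale
se = (math.log(hi) - math.log(lo)) / (2 * 1.96)   # SE of ln(HR)
z = log_hr / se

print(f"implied HR  = {math.exp(log_hr):.3f}")    # ~4.180, matches report
print(f"SE(ln HR)   = {se:.3f}")
print(f"z statistic = {z:.2f}")                   # ~7.5, hence P < .001
```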
6.
IEEE Trans Neural Netw Learn Syst; 31(12): 5349-5362, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32031953

ABSTRACT

Residual connections significantly boost the performance of deep neural networks. However, few theoretical results address the influence of residuals on the hypothesis complexity and the generalization ability of deep neural networks. This article studies the influence of residual connections on the hypothesis complexity of the neural network in terms of the covering number of its hypothesis space. We first present an upper bound on the covering number of networks with residual connections. This bound shares a similar structure with that of neural networks without residual connections, which suggests that moving a weight matrix or nonlinear activation from the "bone" to a "vine" would not enlarge the hypothesis space. Afterward, an O(1/√N) margin-based multiclass generalization bound is obtained for ResNet, as an exemplary case of deep neural networks with residual connections. Generalization guarantees for similar state-of-the-art architectures, such as DenseNet and ResNeXt, follow straightforwardly. According to the obtained generalization bound, regularization terms should be introduced in practice to keep the norms of the weight matrices from growing too large, ensuring good generalization ability; this justifies the technique of weight decay.


Subjects
Neural Networks, Computer; Algorithms; Deep Learning; Generalization, Psychological
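
Entry 6's conclusion, that the bound justifies weight decay, corresponds to standard practice. Here is a minimal sketch, assuming plain SGD (for which an explicit L2 penalty and the optimizer's built-in weight_decay coincide); the architecture and coefficient are illustrative, not taken from the article.

```python
# Two equivalent ways (under plain SGD) to keep weight-matrix norms
# small, which the covering-number bound above motivates.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# Option 1: built-in weight decay in the optimizer.
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=5e-4)

# Option 2: an explicit L2 penalty added to the loss, which makes the
# norm control visible in the objective itself.
def l2_penalty(model, lam=5e-4):
    return lam * sum(p.pow(2).sum() for p in model.parameters())
```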