Results 1 - 20 of 27
1.
BMC Bioinformatics ; 25(1): 269, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39164632

ABSTRACT

BACKGROUND: Fluorescence microscopy (FM) is an important and widely adopted biological imaging technique. Segmentation is often the first step in quantitative analysis of FM images. Deep neural networks (DNNs) have become the state-of-the-art tools for image segmentation. However, their performance on natural images may collapse under certain image corruptions or adversarial attacks. This poses real risks to their deployment in real-world applications. Although the robustness of DNN models in segmenting natural images has been studied extensively, their robustness in segmenting FM images remains poorly understood. RESULTS: To address this deficiency, we have developed an assay that benchmarks the robustness of DNN segmentation models using datasets of realistic synthetic 2D FM images with precisely controlled corruptions or adversarial attacks. Using this assay, we have benchmarked the robustness of ten representative models such as DeepLab and Vision Transformer. We find that models with good robustness on natural images may perform poorly on FM images. We also find new robustness properties of DNN models and new connections between their corruption robustness and adversarial robustness. To further assess the robustness of the selected models, we have also benchmarked them on real microscopy images of different modalities without using simulated degradation. The results are consistent with those obtained on the realistic synthetic images, confirming the fidelity and reliability of our image synthesis method as well as the effectiveness of our assay. CONCLUSIONS: Based on comprehensive benchmarking experiments, we have found distinct robustness properties of deep neural networks in semantic segmentation of FM images. Based on these findings, we make specific recommendations on the selection and design of robust models for FM image segmentation.
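
To make the benchmarking idea concrete, the hedged sketch below evaluates a segmentation model at several synthetic corruption severities; the Gaussian-noise corruption, the IoU metric, and the `model`/`loader` objects are illustrative assumptions, not the assay described in the paper.

```python
# Minimal sketch of a corruption-robustness benchmark for a segmentation model.
# `model` and `loader` are placeholders; the noise model and IoU metric below are
# generic illustrations, not the paper's exact synthesis pipeline.
import torch

def add_gaussian_noise(images, sigma):
    """Corrupt a batch of images with additive Gaussian noise of std `sigma`."""
    return (images + sigma * torch.randn_like(images)).clamp(0.0, 1.0)

def iou_score(pred, target, eps=1e-7):
    """Binary intersection-over-union between predicted and reference masks."""
    inter = (pred & target).sum().float()
    union = (pred | target).sum().float()
    return (inter + eps) / (union + eps)

@torch.no_grad()
def benchmark_corruption(model, loader, sigmas=(0.0, 0.05, 0.1, 0.2)):
    results = {}
    for sigma in sigmas:
        scores = []
        for images, masks in loader:
            noisy = add_gaussian_noise(images, sigma)
            pred = model(noisy).sigmoid() > 0.5
            scores.append(iou_score(pred, masks.bool()).item())
        results[sigma] = sum(scores) / len(scores)
    return results  # mean IoU per corruption severity
```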


Subjects
Benchmarking, Computer-Assisted Image Processing, Fluorescence Microscopy, Neural Networks, Fluorescence Microscopy/methods, Benchmarking/methods, Computer-Assisted Image Processing/methods, Semantics, Deep Learning, Algorithms, Humans
2.
Front Comput Neurosci ; 18: 1388166, 2024.
Article in English | MEDLINE | ID: mdl-39114083

ABSTRACT

A good theory of mathematical beauty is more practical than any current observation, as new predictions about physical reality can be self-consistently verified. This belief applies to the current status of understanding deep neural networks, including large language models and even biological intelligence. Toy models provide a metaphor of physical reality, allowing the reality to be formulated mathematically (i.e., the so-called theory), which can be updated as more conjectures are justified or refuted. One does not need to present all details in a model; rather, more abstract models are constructed, as complex systems such as brains or deep networks have many sloppy dimensions but far fewer stiff dimensions that strongly impact macroscopic observables. This type of bottom-up mechanistic modeling remains promising in the modern era of understanding natural or artificial intelligence. Here, we shed light on eight challenges in developing a theory of intelligence following this theoretical paradigm. These challenges are representation learning, generalization, adversarial robustness, continual learning, causal learning, the internal model of the brain, next-token prediction, and the mechanics of subjective experience.

3.
Sensors (Basel) ; 24(16), 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39205134

ABSTRACT

LiDAR sensors have been shown to generate data with various common corruptions, which seriously affect their applications in 3D vision tasks, particularly object detection. At the same time, it has been demonstrated that traditional defense strategies, including adversarial training, are prone to suffering from gradient confusion during training. Moreover, they can only improve robustness against specific types of data corruption. In this work, we propose LiDARPure, which leverages the powerful generative ability of diffusion models to purify corruption in LiDAR scene data. By dividing the entire scene into voxels to facilitate the diffusion and reverse-diffusion processes, LiDARPure overcomes challenges that hinder adversarial training, such as sparse point clouds in large-scale LiDAR data and gradient confusion. In addition, we utilize the latent geometric features of a scene as a condition to assist the generation of diffusion models. Detailed experiments show that LiDARPure can effectively purify 19 common types of LiDAR data corruption. Further evaluation results demonstrate that it can improve the average precision of 3D object detectors by up to 20% in the face of data corruption, much higher than existing defense strategies.
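
As a rough illustration of the voxel-wise processing described above, the sketch below partitions a LiDAR point cloud into voxels and applies a per-voxel purification callback; the voxel size and the `purify_voxel` function are placeholders, not LiDARPure's actual diffusion model.

```python
# Rough sketch of splitting a LiDAR scene into voxels, the partitioning step that
# precedes per-voxel diffusion/reverse diffusion. Voxel size is an arbitrary choice.
import numpy as np

def voxelize(points, voxel_size=2.0):
    """Group an (N, 3+) point cloud into a dict keyed by integer voxel index."""
    indices = np.floor(points[:, :3] / voxel_size).astype(np.int64)
    voxels = {}
    for idx, point in zip(map(tuple, indices), points):
        voxels.setdefault(idx, []).append(point)
    return {key: np.stack(pts) for key, pts in voxels.items()}

def purify_scene(points, purify_voxel, voxel_size=2.0):
    """Apply a per-voxel purification function (e.g., a diffusion model) and merge."""
    voxels = voxelize(points, voxel_size)
    return np.concatenate([purify_voxel(pts) for pts in voxels.values()], axis=0)
```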

4.
Sensors (Basel) ; 24(9), 2024 May 06.
Article in English | MEDLINE | ID: mdl-38733060

ABSTRACT

Deep neural networks (DNNs) are increasingly important in the medical diagnosis of electrocardiogram (ECG) signals. However, research has shown that DNNs are highly vulnerable to adversarial examples, which can be created by carefully crafted perturbations. This vulnerability can lead to potential medical accidents and poses new challenges for the application of DNNs in the medical diagnosis of ECG signals. This paper proposes a novel network, Channel Activation Suppression with Lipschitz Constraints Net (CASLCNet), which employs the Channel-wise Activation Suppressing (CAS) strategy to dynamically adjust the contribution of different channels to the class prediction and uses a 1-Lipschitz ℓ∞-distance network as a robust classifier to reduce the impact of adversarial perturbations on the model, in order to increase its adversarial robustness. The experimental results demonstrate that CASLCNet achieves ACCrobust scores of 91.03% and 83.01% when subjected to PGD attacks on the MIT-BIH and CPSC2018 datasets, respectively, showing that the proposed method enhances the model's adversarial robustness while maintaining a high accuracy rate.
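
The sketch below loosely illustrates the idea of suppressing per-channel contributions with a learned gate; it is a generic illustration under assumed tensor shapes, not the CASLCNet architecture or its 1-Lipschitz ℓ∞-distance classifier.

```python
# Loose illustration of channel-wise activation suppression: a learned sigmoid gate
# scales each channel's contribution before classification.
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    def __init__(self, num_channels):
        super().__init__()
        self.gate = nn.Parameter(torch.zeros(num_channels))  # one logit per channel

    def forward(self, feats):             # feats: (batch, channels, length)
        weights = torch.sigmoid(self.gate).view(1, -1, 1)
        return feats * weights            # channels with low gate values are suppressed
```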


Subjects
Algorithms, Electrocardiography, Neural Networks, Electrocardiography/methods, Humans, Computer-Assisted Signal Processing
5.
Sensors (Basel) ; 24(8), 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38676241

ABSTRACT

Recently, Machine Learning (ML)-based solutions have been widely adopted to tackle the wide range of security challenges that have affected the progress of the Internet of Things (IoT) in various domains. Despite the reported promising results, ML-based Intrusion Detection Systems (IDSs) have proved to be vulnerable to adversarial examples, which pose an increasing threat. In fact, attackers employ Adversarial Machine Learning (AML) to cause severe performance degradation and thereby evade detection systems. This has prompted the need for reliable defense strategies to handle performance and ensure secure networks. This work introduces RobEns, a robust ensemble framework that aims at: (i) exploiting state-of-the-art ML-based models alongside ensemble models for IDSs in the IoT network; (ii) investigating the impact of evasion AML attacks against the provided models within a black-box scenario; and (iii) evaluating the robustness of the considered models after deploying relevant defense methods. In particular, four typical AML attacks are considered to investigate six ML-based IDSs using three benchmarking datasets. Moreover, multi-class classification scenarios are designed to assess the performance of each attack type. The experiments indicated a drastic drop in detection accuracy for some attacks. To harden the IDS even further, two defense mechanisms were derived from both data-based and model-based methods. Specifically, these methods relied on feature squeezing as well as adversarial training defense strategies. They yielded promising results, enhanced robustness, and maintained standard accuracy in the presence or absence of adversaries. The obtained results proved the efficiency of the proposed framework in robustifying IDS performance within the IoT context. In particular, the accuracy reached 100% for black-box attack scenarios while also preserving the accuracy in the absence of attacks.
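
For a concrete picture of the data-based defense mentioned above, here is a minimal feature-squeezing sketch (bit-depth reduction plus a prediction-shift detector); the bit depth, threshold, and `model_predict` callable are assumptions for illustration.

```python
# Minimal sketch of the feature-squeezing idea used as a data-based defense:
# reduce input precision so small adversarial perturbations are rounded away.
import numpy as np

def squeeze_bit_depth(x, bits=4):
    """Quantize features in [0, 1] to 2**bits discrete levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def detect_adversarial(model_predict, x, threshold=0.5):
    """Flag inputs whose prediction shifts a lot (L1 distance) after squeezing."""
    diff = np.abs(model_predict(x) - model_predict(squeeze_bit_depth(x))).sum(axis=-1)
    return diff > threshold
```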

6.
Neural Netw ; 174: 106224, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38479186

ABSTRACT

Adversarial training has become the mainstream method to boost the adversarial robustness of deep models. However, it often suffers from a trade-off dilemma, where the use of adversarial examples hurts the standard generalization of models on natural data. To study this phenomenon, we investigate it from the perspective of spatial attention. In brief, standard training typically encourages a model to conduct a comprehensive check of the input space, but adversarial training often causes a model to overly concentrate on sparse spatial regions. This narrowed attention is beneficial for avoiding adversarial accumulation but easily makes the model ignore abundant discriminative information, thereby resulting in weak generalization. To address this issue, this paper introduces an Attention-Enhanced Learning Framework (AELF) for robustness training. The main idea is to enable the model to inherit the attention pattern of a standard pre-trained model through an embedding-level regularization. To be specific, given a teacher model built on natural examples, the embedding distribution of the teacher model is used as a static constraint to regulate the embedding outputs of the objective model. This design is mainly supported by the observation that the embedding feature of a standard model is usually recognized as a rich semantic integration of the input. For implementation, we present a simplified AELF that achieves the regularization with a single cross-entropy loss via the parameter initialization and parameter update strategy. This avoids the extra consistency comparison operation between embedding vectors. Experimental observations verify the rationality of our argument, and experimental results demonstrate that it can achieve remarkable improvements in generalization under high-level robustness.
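
A minimal sketch of the embedding-level regularization idea follows: the robust model's embedding is pulled toward that of a frozen, standard-trained teacher. The `embed`/`head` split and the loss weight are assumptions, and the paper's simplified AELF avoids this explicit consistency term.

```python
# Hedged sketch of embedding-level regularization with a standard-trained teacher.
import torch
import torch.nn.functional as F

def aelf_style_loss(student, teacher, x_adv, y, lam=1.0):
    z_student = student.embed(x_adv)          # embedding of the robust model
    with torch.no_grad():
        z_teacher = teacher.embed(x_adv)      # attention/semantics of the standard model
    logits = student.head(z_student)
    return F.cross_entropy(logits, y) + lam * F.mse_loss(z_student, z_teacher)
```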


Subjects
Psychological Generalization, Learning, Entropy, Semantics
7.
Entropy (Basel) ; 26(2), 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38392358

ABSTRACT

Despite their remarkable performance, deep learning models still lack robustness guarantees, particularly in the presence of adversarial examples. This significant vulnerability raises concerns about their trustworthiness and hinders their deployment in critical domains that require certified levels of robustness. In this paper, we introduce an information geometric framework to establish precise robustness criteria for l2 white-box attacks in a multi-class classification setting. We endow the output space with the Fisher information metric and derive criteria on the input-output Jacobian to ensure robustness. We show that model robustness can be achieved by constraining the model to be partially isometric around the training points. We evaluate our approach using MNIST and CIFAR-10 datasets against adversarial attacks, revealing its substantial improvements over defensive distillation and Jacobian regularization for medium-sized perturbations and its superior robustness performance to adversarial training for large perturbations, all while maintaining the desired accuracy.
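
As a loose, hedged illustration of constraining the input-output Jacobian around training points, the sketch below adds a randomly projected squared-Jacobian penalty to the training loss; it is not the paper's Fisher-metric criterion, only a generic Jacobian regularizer in the same spirit.

```python
# Sketch of penalizing the input-output Jacobian via random output-space projections.
import torch
import torch.nn.functional as F

def jacobian_penalty(model, x, num_proj=1):
    x = x.clone().requires_grad_(True)
    logits = model(x)
    penalty = 0.0
    for _ in range(num_proj):
        v = torch.randn_like(logits)                 # random direction in output space
        v = v / v.norm(dim=-1, keepdim=True)
        (grad,) = torch.autograd.grad((logits * v).sum(), x, create_graph=True)
        penalty = penalty + grad.pow(2).sum() / x.shape[0]
    return penalty / num_proj

def robust_loss(model, x, y, lam=0.1):
    return F.cross_entropy(model(x), y) + lam * jacobian_penalty(model, x)
```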

8.
Neural Netw ; 172: 106117, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38232423

ABSTRACT

While adversarial training has been proven to be one of the most effective defense methods against adversarial attacks on deep neural networks, it suffers from over-fitting on the training adversarial data and thus may not guarantee robust generalization. This may result from the fact that conventional adversarial training methods usually generate adversarial perturbations in a supervised way, so that the resulting adversarial examples are highly biased towards the decision boundary, leading to an inhomogeneous data distribution. To mitigate this limitation, we propose to generate adversarial examples from a perturbation-diversity perspective. Specifically, the generated perturbed samples are not only adversarial but also diverse, so as to certify robust generalization and significant robustness improvement through a homogeneous data distribution. We provide theoretical and empirical analysis, establishing a foundation to support the proposed method. As a major contribution, we prove that promoting perturbation diversity can lead to a better robust generalization bound. To verify our method's effectiveness, we conduct extensive experiments over different datasets (e.g., CIFAR-10, CIFAR-100, SVHN) with different adversarial attacks (e.g., PGD, CW). Experimental results show that our method outperforms other state-of-the-art methods (e.g., PGD and Feature Scattering) in robust generalization performance.
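
For reference, the sketch below shows a standard PGD-style generator of the kind such methods build on; the paper's diversity-promoting objective is not reproduced, only marked where it would be added.

```python
# Baseline sketch of PGD-style adversarial example generation under an l_inf budget.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        # A perturbation-diversity objective would be added to `loss` here.
        (grad,) = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()     # ascend the loss
            delta.clamp_(-eps, eps)          # stay inside the l_inf ball
    return (x + delta).detach().clamp(0, 1)
```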


Subjects
Psychological Generalization, Neural Networks
9.
Neural Netw ; 171: 127-143, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38091756

ABSTRACT

Recent years have witnessed increasing interest in adversarial attacks on images, while adversarial video attacks have seldom been explored. In this paper, we propose a sparse adversarial attack strategy on videos (DeepSAVA). Our model aims to add a small, human-imperceptible perturbation to the key frame of the input video to fool the classifiers. To carry out an effective attack that mirrors real-world scenarios, our algorithm integrates spatial transformation perturbations into the frame. Instead of using the lp norm to gauge the disparity between the perturbed frame and the original frame, we employ the structural similarity index (SSIM), which has been established as a more suitable metric for quantifying image alterations resulting from spatial perturbations. We employ a unified optimisation framework to combine spatial transformation with additive perturbation, thereby attaining a more potent attack. We design an effective and novel optimisation scheme that alternately utilises Bayesian Optimisation (BO) to identify the most critical frame in a video and stochastic gradient descent (SGD) based optimisation to produce both additive and spatially transformed perturbations. Doing so enables DeepSAVA to perform a very sparse attack on videos, maintaining human imperceptibility while still achieving state-of-the-art performance in terms of both attack success rate and adversarial transferability. Furthermore, built upon the strong perturbations produced by DeepSAVA, we design a novel adversarial training framework to improve the robustness of video classification models. Our intensive experiments on various types of deep neural networks and video datasets confirm the superiority of DeepSAVA in terms of attacking performance and efficiency. When compared to the baseline techniques, DeepSAVA exhibits the highest level of performance in generating adversarial videos for three distinct video classifiers. Remarkably, it achieves an impressive fooling rate ranging from 99.5% to 100% for the I3D model with the perturbation of just a single frame. Additionally, DeepSAVA demonstrates favourable transferability across various time series models. The proposed adversarial training strategy is also empirically shown to train more robust video classifiers than state-of-the-art adversarial training with a projected gradient descent (PGD) adversary.


Subjects
Algorithms, Neural Networks, Humans, Bayes Theorem, Recognition (Psychology), Time Factors
10.
Neural Netw ; 172: 106087, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38160621

ABSTRACT

Deep neural networks (DNNs) are vulnerable to attacks by adversarial examples, which bring serious security risks to learning systems. In this paper, we propose a new defense method to improve the adversarial robustness of DNNs based on stochastic neural networks (SNNs), termed Margin-SNN. The proposed Margin-SNN mainly includes two modules, i.e., a feature uncertainty learning module and a label embedding module. The first module introduces uncertainty into the latent feature space by giving each sample a distributional representation rather than a fixed-point representation, and leverages the advantages of the variational information bottleneck method in achieving good intra-class compactness in latent space. The second module develops a label embedding mechanism to take advantage of the semantic information underlying the labels, mapping the labels into the same latent space as the features in order to capture the similarity between a sample and its class centroid, where a penalty term is added to enlarge the margin between different classes for better inter-class separability. Since no adversarial information is introduced, the proposed model can be learned with standard training to improve adversarial robustness, which is much more efficient than adversarial training. Extensive experiments on the MNIST, FASHION MNIST, CIFAR10, CIFAR100 and SVHN datasets demonstrate the superior defensive ability of the proposed method. Our code is available at https://github.com/humeng24/Margin-SNN.


Subjects
Learning, Neural Networks, Uncertainty, Semantics
11.
Neural Netw ; 167: 706-714, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37729786

ABSTRACT

Adversarial training is considered one of the most effective methods to improve the adversarial robustness of deep neural networks. Despite its success, it still suffers from unsatisfactory performance and overfitting. Considering the intrinsic mechanism of adversarial training, recent studies adopt the idea of curriculum learning to alleviate overfitting. However, this also introduces new issues, namely the lack of a quantitative criterion for attack strength and catastrophic forgetting. To mitigate these issues, we propose self-paced adversarial training (SPAT), which explicitly builds the learning process of adversarial training on adversarial examples from the whole dataset. Specifically, our model is first trained with "easy" adversarial examples and then continuously strengthened by gradually adding "complex" adversarial examples. This strengthens the ability to fit "complex" adversarial examples while keeping "easy" adversarial examples in mind. To balance adversarial examples between classes, we determine the difficulty of adversarial examples locally within each class. Notably, this learning paradigm can also be incorporated into other advanced methods to further boost adversarial robustness. Experimental results show the effectiveness of our proposed model against various attacks on widely used benchmarks. In particular, on CIFAR100, SPAT provides a boost of 1.7% (relatively 5.4%) in robust accuracy under the PGD10 attack and 3.9% (relatively 7.2%) in natural accuracy for AWP.
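
A hedged sketch of the self-paced selection idea follows: adversarial examples are ranked by loss within each class and only the easiest fraction is kept early in training, with the fraction growing over epochs. The pacing schedule and loss-based difficulty score are illustrative assumptions, not SPAT's exact criterion.

```python
# Sketch of a per-class self-paced schedule over adversarial examples.
import torch

def select_self_paced(losses, labels, keep_frac):
    """Keep the `keep_frac` lowest-loss ("easy") adversarial examples per class."""
    keep = torch.zeros_like(losses, dtype=torch.bool)
    for c in labels.unique():
        idx = (labels == c).nonzero(as_tuple=True)[0]
        k = max(1, int(keep_frac * idx.numel()))
        easiest = idx[losses[idx].argsort()[:k]]
        keep[easiest] = True
    return keep

def pace(epoch, total_epochs, start=0.3):
    """Linearly grow the kept fraction from `start` to 1.0 over training."""
    return min(1.0, start + (1.0 - start) * epoch / max(1, total_epochs - 1))
```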


Subjects
Benchmarking, Learning, Neural Networks
12.
Neural Netw ; 167: 266-282, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37666185

ABSTRACT

Adversarial robustness is considered a required property of deep neural networks. In this study, we discover that adversarially trained models might have significantly different characteristics in terms of margin and smoothness, even though they show similar robustness. Inspired by the observation, we investigate the effect of different regularizers and discover the negative effect of the smoothness regularizer on maximizing the margin. Based on the analyses, we propose a new method called bridged adversarial training that mitigates the negative effect by bridging the gap between clean and adversarial examples. We provide theoretical and empirical evidence that the proposed method provides stable and better robustness, especially for large perturbations.
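
A rough sketch of the bridging idea, under the assumption that intermediate points are simple linear interpolations between clean and adversarial inputs; the number of bridge points and their weighting are illustrative, not the paper's formulation.

```python
# Sketch of a "bridged" loss evaluated along the clean-to-adversarial path.
import torch
import torch.nn.functional as F

def bridged_loss(model, x, x_adv, y, num_bridges=2):
    losses = [F.cross_entropy(model(x), y), F.cross_entropy(model(x_adv), y)]
    for i in range(1, num_bridges + 1):
        t = i / (num_bridges + 1)
        x_mid = (1 - t) * x + t * x_adv          # point between clean and adversarial input
        losses.append(F.cross_entropy(model(x_mid), y))
    return torch.stack(losses).mean()
```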


Subjects
Neural Networks
13.
Front Neurorobot ; 17: 1205370, 2023.
Article in English | MEDLINE | ID: mdl-37614968

ABSTRACT

Deep neural networks (DNNs) have been shown to be susceptible to critical vulnerabilities when attacked by adversarial samples. This has prompted the development of attack and defense strategies similar to those used in cyberspace security. The dependence of such strategies on attack and defense mechanisms makes the associated algorithms on both sides appear as closely coupled processes, with the defense method being particularly passive in these processes. Inspired by the dynamic defense approach proposed in cyberspace security to address endless arms races, this article defines ensemble quantity, network structure, and smoothing parameters as variable ensemble attributes and proposes a stochastic ensemble strategy based on heterogeneous and redundant sub-models. The proposed method introduces the diversity and randomness characteristic of deep neural networks to alter the fixed gradient correspondence between input and output. The unpredictability and diversity of the gradients make it more difficult for attackers to directly implement white-box attacks, helping to address the extreme transferability and vulnerability of ensemble models under white-box attacks. Experimental comparison of ASR-vs.-distortion curves for different attack scenarios on CIFAR10 preliminarily demonstrates the effectiveness of the proposed method: even the highest-capacity attacker cannot easily improve the attack success rate against the ensemble-smoothed model, especially for untargeted attacks.
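
A minimal sketch of the stochastic ensemble idea: each query is answered by a randomly sampled subset of heterogeneous sub-models, so the attacker never observes a fixed input gradient. Sub-model construction and smoothing parameters are abstracted away here.

```python
# Sketch of a randomized ensemble prediction over heterogeneous sub-models.
import random
import torch

@torch.no_grad()
def stochastic_ensemble_predict(models, x, subset_size=3):
    chosen = random.sample(models, k=min(subset_size, len(models)))
    probs = torch.stack([m(x).softmax(dim=-1) for m in chosen])
    return probs.mean(dim=0)   # averaged prediction of the sampled sub-models
```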

14.
Neurocomputing (Amst) ; 551, 2023 Sep 28.
Article in English | MEDLINE | ID: mdl-37587916

ABSTRACT

Adversarial training is the most popular and general strategy to improve Deep Neural Network (DNN) robustness against adversarial noise. Many adversarial training methods have been proposed in the past few years. However, most of these methods are highly sensitive to hyperparameters, especially the training noise upper bound. Tuning these hyperparameters is expensive and difficult for people outside the adversarial robustness research domain, which prevents adversarial training techniques from being used in many application fields. In this study, we propose a new adversarial training method, named Adaptive Margin Evolution (AME). Besides being hyperparameter-free for the user, our AME method places adversarial training samples at optimal locations in the input space by gradually expanding the exploration range with self-adaptive and gradient-aware step sizes. We evaluate AME and seven other well-known adversarial training methods on three common benchmark datasets (CIFAR10, SVHN, and Tiny ImageNet) under the most challenging adversarial attack: AutoAttack. The results show that: (1) on the three datasets, AME has the best overall performance; (2) on the Tiny ImageNet dataset, which is much more challenging, AME has the best performance at every noise level. Our work may pave the way for adopting adversarial training techniques in application domains where hyperparameter-free methods are preferred.
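
As a hedged illustration of per-sample margin adaptation, the sketch below grows each sample's perturbation bound while the model still resists attacks at that bound and shrinks it otherwise; the multiplicative update rule and cap are assumptions, not AME's gradient-aware step sizes.

```python
# Sketch of adapting a per-sample perturbation bound ("margin") during training.
import torch

@torch.no_grad()
def update_margins(model, x_adv, y, margins, grow=1.1, shrink=0.9, max_eps=0.3):
    pred = model(x_adv).argmax(dim=-1)
    correct = pred.eq(y)
    margins[correct] = (margins[correct] * grow).clamp(max=max_eps)   # push further out
    margins[~correct] = margins[~correct] * shrink                    # back off
    return margins
```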

15.
Comput Methods Programs Biomed ; 240: 107687, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37392695

ABSTRACT

BACKGROUND AND OBJECTIVE: Deep neural networks (DNNs) are vulnerable to adversarial noise. Adversarial training is a general and effective strategy to improve DNN robustness (i.e., accuracy on noisy data) against adversarial noise. However, DNN models trained by existing adversarial training methods may have much lower standard accuracy (i.e., accuracy on clean data) than the same models trained by the standard method on clean data; this phenomenon is known as the trade-off between accuracy and robustness and is commonly considered unavoidable. This issue prevents adversarial training from being used in many application domains, such as medical image analysis, as practitioners do not want to sacrifice standard accuracy too much in exchange for adversarial robustness. Our objective is to lift (i.e., alleviate or even avoid) this trade-off between standard accuracy and adversarial robustness for medical image classification and segmentation. METHODS: We propose a novel adversarial training method, named Increasing-Margin Adversarial (IMA) Training, which is supported by an equilibrium-state analysis of the optimality of adversarial training samples. Our method aims to preserve accuracy while improving robustness by generating optimal adversarial training samples. We evaluate our method and eight other representative methods on six publicly available image datasets corrupted by noise generated by AutoAttack and a white-noise attack. RESULTS: Our method achieves the highest adversarial robustness for image classification and segmentation with the smallest reduction in accuracy on clean data. For one of the applications, our method improves both accuracy and robustness. CONCLUSIONS: Our study has demonstrated that our method can lift the trade-off between standard accuracy and adversarial robustness for image classification and segmentation applications. To our knowledge, it is the first work to show that the trade-off is avoidable for medical image segmentation.


Subjects
Computer-Assisted Image Processing, Neural Networks, Computer-Assisted Image Processing/methods
16.
Neural Netw ; 165: 164-174, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37295205

ABSTRACT

The Spiking Neural Network (SNN) has been recognized as the third generation of neural networks. Conventionally, an SNN can be converted from a pre-trained Artificial Neural Network (ANN) with less computation and memory than training from scratch. However, these converted SNNs are vulnerable to adversarial attacks. Numerical experiments demonstrate that an SNN trained by optimizing the loss function is more adversarially robust, but a theoretical analysis of the mechanism behind this robustness has been lacking. In this paper, we provide a theoretical explanation by analyzing the expected risk function. Starting by modeling the stochastic process introduced by the Poisson encoder, we prove that there is a positive semidefinite regularizer. Perhaps surprisingly, this regularizer can make the gradients of the output with respect to the input closer to zero, resulting in inherent robustness against adversarial attacks. Extensive experiments on the CIFAR10 and CIFAR100 datasets support our point of view. For example, we find that the sum of squares of the gradients of the converted SNNs is 13∼160 times that of the trained SNNs, and the smaller the sum of squares of the gradients, the smaller the degradation of accuracy under adversarial attack.
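
The diagnostic quantity discussed above can be computed directly; the sketch below measures the sum of squared input gradients for a given class output, under the assumption that `model` maps an input batch to class scores.

```python
# Sketch of measuring input-gradient energy: the sum of squared gradients of a
# class score with respect to the input. Smaller values indicate flatter local
# behavior and, per the comparison above, better adversarial robustness.
import torch

def input_gradient_energy(model, x, target_class):
    x = x.clone().requires_grad_(True)
    out = model(x)[:, target_class].sum()
    (grad,) = torch.autograd.grad(out, x)
    return grad.pow(2).sum().item()
```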


Subjects
Neural Networks
17.
Comput Biol Med ; 157: 106791, 2023 May.
Article in English | MEDLINE | ID: mdl-36958234

ABSTRACT

Convolutional Neural Networks (CNNs) have advanced existing medical systems for automatic disease diagnosis. However, there are still concerns about the reliability of deep medical diagnosis systems against the potential threats of adversarial attacks since inaccurate diagnosis could lead to disastrous consequences in the safety realm. In this study, we propose a highly robust yet efficient CNN-Transformer hybrid model which is equipped with the locality of CNNs as well as the global connectivity of vision Transformers. To mitigate the high quadratic complexity of the self-attention mechanism while jointly attending to information in various representation subspaces, we construct our attention mechanism by means of an efficient convolution operation. Moreover, to alleviate the fragility of our Transformer model against adversarial attacks, we attempt to learn smoother decision boundaries. To this end, we augment the shape information of an image in the high-level feature space by permuting the feature mean and variance within mini-batches. With less computational complexity, our proposed hybrid model demonstrates its high robustness and generalization ability compared to the state-of-the-art studies on a large-scale collection of standardized MedMNIST-2D datasets.
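
A minimal sketch of permuting feature statistics within a mini-batch, the shape-augmentation step described above: each sample's channel-wise mean and standard deviation are replaced by another sample's. The exact placement in the network and any mixing weights are not taken from the paper.

```python
# Sketch of shuffling per-sample channel statistics ("style") inside a mini-batch
# while keeping the normalized content ("shape") of each feature map.
import torch

def permute_feature_stats(feats, eps=1e-6):
    """feats: (batch, channels, height, width) high-level feature maps."""
    mu = feats.mean(dim=(2, 3), keepdim=True)
    sigma = feats.std(dim=(2, 3), keepdim=True) + eps
    normalized = (feats - mu) / sigma
    perm = torch.randperm(feats.size(0))
    return normalized * sigma[perm] + mu[perm]   # re-style with another sample's stats
```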


Subjects
Learning, Neural Networks, Reproducibility of Results
18.
IEEE Access ; 10: 58071-58080, 2022.
Article in English | MEDLINE | ID: mdl-36339794

ABSTRACT

Neurons in the brain are complex machines with distinct functional compartments that interact nonlinearly. In contrast, neurons in artificial neural networks abstract away this complexity, typically down to a scalar activation function of a weighted sum of inputs. Here we emulate more biologically realistic neurons by learning canonical activation functions with two input arguments, analogous to basal and apical dendrites. We use a network-in-network architecture where each neuron is modeled as a multilayer perceptron with two inputs and a single output. This inner perceptron is shared by all units in the outer network. Remarkably, the resultant nonlinearities often produce soft XOR functions, consistent with recent experimental observations about interactions between inputs in human cortical neurons. When hyperparameters are optimized, networks with these nonlinearities learn faster and perform better than conventional ReLU nonlinearities with matched parameter counts, and they are more robust to natural and adversarial perturbations.
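
The sketch below illustrates the network-in-network construction: every unit feeds two aggregated inputs (analogous to basal and apical drives) into a small inner MLP that is shared across all units. Layer sizes are arbitrary choices, not the paper's trained configuration.

```python
# Sketch of a shared two-input "activation function" implemented as an inner MLP.
import torch
import torch.nn as nn

class SharedTwoInputActivation(nn.Module):
    def __init__(self, hidden=8):
        super().__init__()
        self.inner = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, basal, apical):                 # both: (batch, units)
        pair = torch.stack([basal, apical], dim=-1)   # (batch, units, 2)
        return self.inner(pair).squeeze(-1)           # one output per unit

class TwoCompartmentLayer(nn.Module):
    def __init__(self, in_dim, out_dim, activation):
        super().__init__()
        self.basal = nn.Linear(in_dim, out_dim)
        self.apical = nn.Linear(in_dim, out_dim)
        self.activation = activation                  # shared across all layers/units

    def forward(self, x):
        return self.activation(self.basal(x), self.apical(x))
```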

19.
Front Artif Intell ; 5: 952773, 2022.
Article in English | MEDLINE | ID: mdl-36262462

ABSTRACT

Remarkable progress in the fields of machine learning (ML) and artificial intelligence (AI) has led to an increased number of applications of (data-driven) AI systems for the partial or complete control of safety-critical systems. Recently, ML solutions have been particularly popular. Such approaches are often met with concerns regarding their correct and safe execution, which are often caused by missing knowledge about, or the opacity of, their exact functionality. The investigation and derivation of methods for the safety assessment of AI systems are thus of great importance. Among others, these issues are addressed in the field of AI Safety. The aim of this work is to provide an overview of this field by means of a systematic literature review with a special focus on the area of highly automated driving, as well as to present a selection of approaches and methods for the safety assessment of AI systems. In particular, validation, verification, and testing are considered in this context. In the review process, two distinct classes of approaches were identified: on the one hand, established methods that refer to already published standards or well-established concepts from research areas outside ML and AI; on the other hand, newly developed approaches, including methods tailored to the scope of ML and AI, which have gained importance only in recent years.

20.
Neural Netw ; 155: 177-203, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36058022

ABSTRACT

The Convolutional Neural Network is one of the best-known members of the deep learning family of neural network architectures and is used for many purposes, including image classification. In spite of their wide adoption, such networks are known to be highly tuned to the training data (samples representing a particular problem) and are poorly reusable for new problems. One way to change this would be to introduce, in addition to trainable weights, trainable parameters in the mathematical functions that simulate various neural computations within such networks. In this way, we may distinguish between narrowly focused, task-specific parameters (weights) and more generic, capability-specific parameters. In this paper, we suggest a couple of flexible mathematical functions (the Generalized Lehmer Mean and Generalized Power Mean) with trainable parameters to replace some fixed operations (such as the ordinary arithmetic mean or simple weighted aggregation) traditionally used within various components of a convolutional neural network architecture. We name the overall architecture with such an update a hyper-flexible convolutional neural network. We provide mathematical justification of the various components of this architecture and experimentally show that it performs better than the traditional one, including better robustness to adversarial perturbations of the test data.
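
As an illustration of a trainable aggregation function, here is a hedged sketch of a pooling layer based on the textbook Lehmer mean with a learnable exponent; the paper's generalized forms may differ, and positive inputs are assumed.

```python
# Sketch of aggregation with a trainable Lehmer-mean exponent, replacing a fixed
# arithmetic mean: L_p(x) = sum(x_i^p) / sum(x_i^(p-1)); p = 1 recovers the mean.
import torch
import torch.nn as nn

class LehmerPool(nn.Module):
    def __init__(self, p_init=1.0):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(float(p_init)))   # trainable exponent

    def forward(self, x, dim=-1, eps=1e-6):
        x = x.clamp_min(eps)                                  # Lehmer mean needs x > 0
        return x.pow(self.p).sum(dim) / x.pow(self.p - 1).sum(dim)
```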


Subjects
Machine Learning, Neural Networks