Results 1 - 20 of 62
1.
Neural Netw ; 178: 106493, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38970946

ABSTRACT

Visual object tracking, which is primarily based on visible light image sequences, encounters numerous challenges in complicated scenarios, such as low light conditions, high dynamic ranges, and background clutter. To address these challenges, incorporating the advantages of multiple visual modalities is a promising solution for achieving reliable object tracking. However, the existing approaches usually integrate multimodal inputs through adaptive local feature interactions, which cannot leverage the full potential of visual cues, thus resulting in insufficient feature modeling. In this study, we propose a novel multimodal hybrid tracker (MMHT) that utilizes frame-event-based data for reliable single object tracking. The MMHT model employs a hybrid backbone consisting of an artificial neural network (ANN) and a spiking neural network (SNN) to extract dominant features from different visual modalities and then uses a unified encoder to align the features across different domains. Moreover, we propose an enhanced transformer-based module to fuse multimodal features using attention mechanisms. With these methods, the MMHT model can effectively construct a multiscale and multidimensional visual feature space and achieve discriminative feature modeling. Extensive experiments demonstrate that the MMHT model exhibits competitive performance in comparison with that of other state-of-the-art methods. Overall, our results highlight the effectiveness of the MMHT model in terms of addressing the challenges faced in visual object tracking tasks.

2.
Article in English | MEDLINE | ID: mdl-38833393

ABSTRACT

Sensory information recognition is primarily processed through the ventral and dorsal visual pathways in the primate brain visual system, which exhibits layered feature representations bearing a strong resemblance to convolutional neural networks (CNNs), encompassing reconstruction and classification. However, existing studies often treat these pathways as distinct entities, focusing individually on pattern reconstruction or classification tasks, overlooking a key feature of biological neurons, the fundamental units for neural computation of visual sensory information. Addressing these limitations, we introduce a unified framework for sensory information recognition with augmented spikes. By integrating pattern reconstruction and classification within a single framework, our approach not only accurately reconstructs multimodal sensory information but also provides precise classification through definitive labeling. Experimental evaluations conducted on various datasets including video scenes, static images, dynamic auditory scenes, and functional magnetic resonance imaging (fMRI) brain activities demonstrate that our framework delivers state-of-the-art pattern reconstruction quality and classification accuracy. The proposed framework enhances the biological realism of multimodal pattern recognition models, offering insights into how the primate brain visual system effectively accomplishes the reconstruction and classification tasks through the integration of ventral and dorsal pathways.

3.
Natl Sci Rev ; 11(5): nwae102, 2024 May.
Article in English | MEDLINE | ID: mdl-38689713

ABSTRACT

Spiking neural networks (SNNs) are gaining increasing attention for their biological plausibility and potential for improved computational efficiency. To match the high spatial-temporal dynamics in SNNs, neuromorphic chips are highly desired to execute SNNs in hardware-based neuron and synapse circuits directly. This paper presents a large-scale neuromorphic chip named Darwin3 with a novel instruction set architecture, which comprises 10 primary instructions and a few extended instructions. It supports flexible neuron model programming and local learning rule designs. The Darwin3 chip architecture is designed in a mesh of computing nodes with an innovative routing algorithm. We used a compression mechanism to represent synaptic connections, significantly reducing memory usage. The Darwin3 chip supports up to 2.35 million neurons, making it the largest of its kind on the neuron scale. The experimental results showed that the code density was improved by up to 28.3× in Darwin3, and that the neuron core fan-in and fan-out were improved by up to 4096× and 3072× by connection compression compared to the physical memory depth. Our Darwin3 chip also provided memory saving between 6.8× and 200.8× when mapping convolutional spiking neural networks onto the chip, demonstrating state-of-the-art performance in accuracy and latency compared to other neuromorphic chips.

4.
Commun Biol ; 7(1): 487, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38649503

ABSTRACT

The phenomenon of semantic satiation, which refers to the loss of meaning of a word or phrase after being repeated many times, is a well-known psychological phenomenon. However, the microscopic neural computational principles responsible for these mechanisms remain unknown. In this study, we use a deep learning model of continuous coupled neural networks to investigate the mechanism underlying semantic satiation and precisely describe this process with neuronal components. Our results suggest that, from a mesoscopic perspective, semantic satiation may be a bottom-up process. Unlike existing macroscopic psychological studies that suggest that semantic satiation is a top-down process, our simulations use a similar experimental paradigm as classical psychology experiments and observe similar results. Satiation of semantic objectives, similar to the learning process of our network model used for object recognition, relies on continuous learning and switching between objects. The underlying neural coupling strengthens or weakens satiation. Taken together, both neural and network mechanisms play a role in controlling semantic satiation.


Subjects
Deep Learning; Semantics; Humans; Neural Networks, Computer; Models, Neurological
5.
Article in English | MEDLINE | ID: mdl-38498737

ABSTRACT

Spiking Neural Networks (SNNs) have attracted significant attention for their energy-efficient and brain-inspired event-driven properties. Recent advancements, notably Spiking-YOLO, have enabled SNNs to undertake advanced object detection tasks. Nevertheless, these methods often suffer from increased latency and diminished detection accuracy, rendering them less suitable for latency-sensitive mobile platforms. Additionally, the conversion of artificial neural networks (ANNs) to SNNs frequently compromises the integrity of the ANNs' structure, resulting in poor feature representation and heightened conversion errors. To address the issues of high latency and low detection accuracy, we introduce two solutions: timestep compression and spike-time-dependent integrated (STDI) coding. Timestep compression effectively reduces the number of timesteps required in the ANN-to-SNN conversion by condensing information. The STDI coding employs a time-varying threshold to augment information capacity. Furthermore, we have developed an SNN-based spatial pyramid pooling (SPP) structure, optimized to preserve the network's structural efficacy during conversion. Utilizing these approaches, we present the ultralow latency and highly accurate object detection model, SUHD. SUHD exhibits exceptional performance on challenging datasets like PASCAL VOC and MS COCO, achieving a remarkable reduction of approximately 750 times in timesteps and a 30% enhancement in mean average precision (mAP) compared to Spiking-YOLO on MS COCO. To the best of our knowledge, SUHD is currently the deepest spike-based object detection model, achieving ultralow timesteps for lossless conversion.

6.
Neural Netw ; 172: 106092, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38211460

ABSTRACT

Spiking neural networks (SNNs) are considered an attractive option for edge-side applications due to their sparse, asynchronous and event-driven characteristics. However, the application of SNNs to object detection tasks faces challenges in achieving good detection accuracy and high detection speed. To overcome these challenges, we propose an end-to-end Trainable Spiking-YOLO (Tr-Spiking-YOLO) for low-latency and high-performance object detection. We evaluate our model on not only the frame-based PASCAL VOC dataset but also the event-based GEN1 Automotive Detection dataset, and investigate the impacts of different decoding methods on detection performance. The experimental results show that our model achieves competitive or better performance in terms of accuracy, latency and energy consumption compared to similar artificial neural network (ANN) and conversion-based SNN object detection models. Furthermore, when deployed on an edge device, our model achieves a processing speed of approximately 14 to 39 FPS while maintaining a desirable mean Average Precision (mAP), making it capable of real-time detection on resource-constrained platforms.


Subjects
Neural Networks, Computer
7.
Article in English | MEDLINE | ID: mdl-38100345

ABSTRACT

Spiking neural networks (SNNs) operating with asynchronous discrete events show higher energy efficiency with sparse computation. A popular approach for implementing deep SNNs is artificial neural network (ANN)-SNN conversion combining both efficient training of ANNs and efficient inference of SNNs. However, the accuracy loss is usually nonnegligible, especially under few time steps, which restricts the applications of SNN on latency-sensitive edge devices greatly. In this article, we first identify that such performance degradation stems from the misrepresentation of the negative or overflow residual membrane potential in SNNs. Inspired by this, we decompose the conversion error into three parts: quantization error, clipping error, and residual membrane potential representation error. With such insights, we propose a two-stage conversion algorithm to minimize those errors, respectively. In addition, we show that each stage achieves significant performance gains in a complementary manner. By evaluating on challenging datasets including CIFAR-10, CIFAR-100, and ImageNet, the proposed method demonstrates the state-of-the-art performance in terms of accuracy, latency, and energy preservation. Furthermore, our method is evaluated using a more challenging object detection task, revealing notable gains in regression performance under ultralow latency, when compared with existing spike-based detection algorithms. Codes will be available at: https://github.com/Windere/snn-cvt-dual-phase.

8.
Patterns (N Y) ; 4(10): 100831, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37876899

ABSTRACT

Networks of spiking neurons underpin the extraordinary information-processing capabilities of the brain and have become pillar models in neuromorphic artificial intelligence. Despite extensive research on spiking neural networks (SNNs), most studies are established on deterministic models, overlooking the inherent non-deterministic, noisy nature of neural computations. This study introduces the noisy SNN (NSNN) and the noise-driven learning (NDL) rule by incorporating noisy neuronal dynamics to exploit the computational advantages of noisy neural processing. The NSNN provides a theoretical framework that yields scalable, flexible, and reliable computation and learning. We demonstrate that this framework leads to spiking neural models with competitive performance, improved robustness against challenging perturbations compared with deterministic SNNs, and better reproduction of probabilistic computation in neural coding. Generally, this study offers a powerful and easy-to-use tool for machine learning, neuromorphic intelligence practitioners, and computational neuroscience researchers.

9.
Article in English | MEDLINE | ID: mdl-37651489

ABSTRACT

Traditional spiking learning algorithms aim to train neurons to spike at a specific time or at a particular frequency, which requires precise time and frequency labels in the training process. In reality, however, usually only aggregated labels of sequential patterns are provided. Aggregate-label (AL) learning was proposed to discover these predictive features in distracting background streams using only aggregated spikes. It has achieved much success recently, but it is still computationally intensive and has limited use in deep networks. To address these issues, we propose an event-driven spiking aggregate learning algorithm (SALA) in this article. Specifically, to reduce the computational complexity, we improve the conventional spike-threshold-surface (STS) calculation in AL learning by analytically calculating voltage peak values in spiking neurons. Then we derive the algorithm for multilayer networks using an event-driven strategy with aggregated spikes. We conduct comprehensive experiments on various tasks including temporal clue recognition, segmented and continuous speech recognition, and neuromorphic image classification. The experimental results demonstrate that the new STS method improves the efficiency of AL learning significantly, and the proposed algorithm outperforms the conventional spiking algorithm in various temporal clue recognition tasks.

10.
Neural Netw ; 166: 174-187, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37494763

ABSTRACT

Experience replay (ER) is a widely-adopted neuroscience-inspired method to perform lifelong learning. Nonetheless, existing ER-based approaches consider very coarse memory modules with simple memory and rehearsal mechanisms that cannot fully exploit the potential of memory replay. Evidence from neuroscience has provided fine-grained memory and rehearsal mechanisms, such as the dual-store memory system consisting of PFC-HC circuits. However, the computational abstraction of these processes is still very challenging. To address these problems, we introduce the Dual-Memory (Dual-MEM) model emulating the memorization, consolidation, and rehearsal process in the PFC-HC dual-store memory circuit. Dual-MEM maintains an incrementally updated short-term memory to benefit current-task learning. At the end of the current task, short-term memories will be consolidated into long-term ones for future rehearsal to alleviate forgetting. For the Dual-MEM optimization, we propose two learning policies that emulate different memory retrieval strategies: Direct Retrieval Learning and Mixup Retrieval Learning. Extensive evaluations on eight benchmarks demonstrate that Dual-MEM delivers compelling performance while maintaining high learning and memory utilization efficiencies under the challenging experience-once setting.


Subjects
Learning; Memory, Short-Term; Education, Continuing; Concept Formation
11.
Comput Biol Med ; 163: 107114, 2023 09.
Article in English | MEDLINE | ID: mdl-37329620

ABSTRACT

To navigate in space, it is important to predict headings in real time from neural responses in the brain to vestibular and visual signals, and the ventral intraparietal area (VIP) is one of the critical brain areas. However, how heading perception is represented in VIP at the population level remains unexplored. Moreover, there are no commonly used methods suitable for decoding headings from the population responses in VIP, given the large spatiotemporal dynamics and heterogeneity in the neural responses. Here, responses were recorded from 210 VIP neurons in three rhesus monkeys while they performed a heading perception task. By specifically and separately modelling both dynamics with sparse representation, we built a sequential sparse autoencoder (SSAE) to perform population decoding on the recorded dataset and tried to maximize the decoding performance. The SSAE relies on a three-layer sparse autoencoder to extract temporal and spatial heading features in the dataset via unsupervised learning, and a softmax classifier to decode the headings. Compared with other population decoding methods, the SSAE achieves a leading accuracy of 96.8% ± 2.1%, and shows the advantages of robustness and low storage and computing burden for real-time prediction. Therefore, our SSAE model performs well in learning neurobiologically plausible features comprising dynamic navigational information.


Subjects
Eye Movements; Motion Perception; Animals; Parietal Lobe/physiology; Motion Perception/physiology; Photic Stimulation/methods; Brain; Macaca mulatta
12.
Front Neurosci ; 17: 1204334, 2023.
Article in English | MEDLINE | ID: mdl-37260839

ABSTRACT

[This corrects the article DOI: 10.3389/fnins.2023.1123698.].

13.
Article in English | MEDLINE | ID: mdl-37030679

ABSTRACT

A large quantity of labeled data is required to train high-performance deep spiking neural networks (SNNs), but obtaining labeled data is expensive. Active learning is proposed to reduce the quantity of labeled data required by deep learning models. However, conventional active learning methods are not as effective in SNNs as they are in conventional artificial neural networks (ANNs) because of the difference in feature representation and information transmission. To address this issue, we propose an effective active learning method for a deep SNN model in this article. Specifically, a loss prediction module, ActiveLossNet, is proposed to extract features and select valuable samples for deep SNNs. Then, we derive the corresponding active learning algorithm for deep SNN models. Comprehensive experiments are conducted on CIFAR-10, MNIST, Fashion-MNIST, and SVHN on different SNN frameworks, including seven-layer CIFARNet and 20-layer ResNet-18. The comparison results demonstrate that the proposed active learning algorithm outperforms random selection and conventional ANN active learning methods. In addition, our method converges faster than conventional active learning methods.

14.
Article in English | MEDLINE | ID: mdl-37022405

ABSTRACT

The temporal credit assignment (TCA) problem, which aims to detect predictive features hidden in distracting background streams, remains a core challenge in biological and machine learning. Aggregate-label (AL) learning is proposed by researchers to resolve this problem by matching spikes with delayed feedback. However, the existing AL learning algorithms only consider the information of a single timestep, which is inconsistent with the real situation. Meanwhile, there is no quantitative evaluation method for TCA problems. To address these limitations, we propose a novel attention-based TCA (ATCA) algorithm and a minimum editing distance (MED)-based quantitative evaluation method. Specifically, we define a loss function based on the attention mechanism to deal with the information contained within the spike clusters and use MED to evaluate the similarity between the spike train and the target clue flow. Experimental results on musical instrument recognition (MedleyDB), speech recognition (TIDIGITS), and gesture recognition (DVS128-Gesture) show that the ATCA algorithm can reach the state-of-the-art (SOTA) level compared with other AL learning algorithms.
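
The minimum editing distance mentioned in the abstract above can be illustrated with a standard Levenshtein computation. This is a minimal sketch, not the authors' evaluation code: it assumes that the spike train and the target clue flow have each been reduced to a sequence of clue labels, and the function name `edit_distance` and the example sequences are hypothetical.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the first i-1 items of a and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning i items into an empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical label sequences: one spurious "B" was detected.
predicted = ["A", "B", "B", "C"]
target = ["A", "B", "C"]
distance = edit_distance(predicted, target)
```

A smaller distance means the detected sequence is closer to the target; how the paper normalizes or aggregates this value is not stated in the abstract.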

15.
Front Neurosci ; 17: 1123698, 2023.
Article in English | MEDLINE | ID: mdl-36875665

ABSTRACT

Event cameras are asynchronous and neuromorphically inspired visual sensors, which have shown great potential in object tracking because they can easily detect moving objects. Since event cameras output discrete events, they are inherently suitable to coordinate with Spiking Neural Networks (SNNs), which have a unique event-driven computation characteristic and energy-efficient computing. In this paper, we tackle the problem of event-based object tracking with a novel architecture built on a discriminatively trained SNN, called the Spiking Convolutional Tracking Network (SCTN). Taking a segment of events as input, SCTN not only better exploits implicit associations among events rather than event-wise processing, but also fully utilizes precise temporal information and maintains the sparse representation in segments instead of frames. To make SCTN more suitable for object tracking, we propose a new loss function that introduces an exponential Intersection over Union (IoU) in the voltage domain. To the best of our knowledge, this is the first tracking network directly trained with SNN. Besides, we present a new event-based tracking dataset, dubbed DVSOT21. Experimental results on DVSOT21 demonstrate that, against other competing trackers, our method achieves competitive performance with very low energy consumption compared to ANN-based trackers. With lower energy consumption, tracking on neuromorphic hardware will reveal its advantage.
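
For readers unfamiliar with the Intersection over Union (IoU) measure named in the abstract above, the following sketch computes the plain IoU of two axis-aligned boxes. It does not reproduce the paper's exponential IoU loss in the voltage domain, whose form is not given in the abstract; the function name `box_iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def box_iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes that overlap in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```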

16.
Cereb Cortex ; 33(11): 6772-6784, 2023 05 24.
Article in English | MEDLINE | ID: mdl-36734278

ABSTRACT

Gaze change can misalign spatial reference frames encoding visual and vestibular signals in cortex, which may affect the heading discrimination. Here, by systematically manipulating the eye-in-head and head-on-body positions to change the gaze direction of subjects, the performance of heading discrimination was tested with visual, vestibular, and combined stimuli in a reaction-time task in which the reaction time is under the control of subjects. We found the gaze change induced substantial biases in perceived heading, increased the threshold of discrimination and reaction time of subjects in all stimulus conditions. For the visual stimulus, the gaze effects were induced by changing the eye-in-world position, and the perceived heading was biased in the opposite direction of gaze. In contrast, the vestibular gaze effects were induced by changing the eye-in-head position, and the perceived heading was biased in the same direction of gaze. Although the bias was reduced when the visual and vestibular stimuli were combined, integration of the 2 signals substantially deviated from predictions of an extended diffusion model that accumulates evidence optimally over time and across sensory modalities. These findings reveal diverse gaze effects on the heading discrimination and emphasize that the transformation of spatial reference frames may underlie the effects.


Subjects
Motion Perception; Vestibule, Labyrinth; Humans; Reaction Time; Cerebral Cortex; Bias; Visual Perception; Photic Stimulation
17.
IEEE Trans Neural Netw Learn Syst ; 34(8): 5200-5205, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34723807

ABSTRACT

Spiking neural networks (SNNs) have received significant attention for their biological plausibility. SNNs theoretically have at least the same computational power as traditional artificial neural networks (ANNs). They possess the potential of achieving energy-efficient machine intelligence while keeping comparable performance to ANNs. However, it is still a big challenge to train a very deep SNN. In this brief, we propose an efficient approach to build deep SNNs. Residual network (ResNet) is considered a state-of-the-art and fundamental model among convolutional neural networks (CNNs). We employ the idea of converting a trained ResNet to a network of spiking neurons named spiking ResNet (S-ResNet). We propose a residual conversion model that appropriately scales continuous-valued activations in ANNs to match the firing rates in SNNs and a compensation mechanism to reduce the error caused by discretization. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on CIFAR-10, CIFAR-100, and ImageNet 2012 with low latency. This work is the first to build an asynchronous SNN deeper than 100 layers, with comparable performance to its original ANN.
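
The general idea of matching a scaled ANN activation to an SNN firing rate can be sketched with a single integrate-and-fire neuron. This is an illustrative toy under stated assumptions, not the S-ResNet conversion model or its compensation mechanism: the function `if_firing_rate`, the unit threshold, the reset-by-subtraction rule, and the `scale` parameter are all hypothetical choices.

```python
def if_firing_rate(activation, scale, timesteps):
    """Firing rate of an integrate-and-fire neuron driven by a constant scaled input."""
    v = 0.0                     # membrane potential
    spikes = 0
    drive = activation / scale  # assumed normalization of the ANN activation
    for _ in range(timesteps):
        v += drive
        if v >= 1.0:            # assumed threshold of 1.0
            spikes += 1
            v -= 1.0            # reset by subtraction keeps the leftover potential
    return spikes / timesteps


# With enough timesteps the rate approaches activation / scale (clipped to [0, 1]).
rate = if_firing_rate(activation=2.0, scale=4.0, timesteps=10)
```

With few timesteps the rate can only take multiples of `1 / timesteps`, which is one way to picture the discretization error the abstract refers to.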

18.
IEEE Trans Neural Netw Learn Syst ; 34(11): 9040-9053, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35298385

ABSTRACT

Neural architecture search (NAS) has attracted much attention in recent years. It automates the neural network construction for different tasks, which is traditionally addressed manually. In the literature, evolutionary optimization (EO) has been proposed for NAS due to its strong global search capability. However, despite the success enjoyed by EO, it is worth noting that existing EO algorithms for NAS are often very computationally expensive, which makes these algorithms impractical in reality. Keeping this in mind, in this article, we propose an efficient memetic algorithm (MA) for automated convolutional neural network (CNN) architecture search. In contrast to existing EO algorithms for CNN architecture design, a new cell-based architecture search space, and new global and local search operators are proposed for CNN architecture search. To further improve the efficiency of our proposed algorithm, we develop a one-epoch-based performance estimation strategy without any pretrained models to evaluate each found architecture on the training datasets. To investigate the performance of the proposed method, comprehensive empirical studies are conducted against 34 state-of-the-art peer algorithms, including manual algorithms, reinforcement learning (RL) algorithms, gradient-based algorithms, and evolutionary algorithms (EAs), on the widely used CIFAR10 and CIFAR100 datasets. The obtained results confirmed the efficacy of the proposed approach for automated CNN architecture design.

19.
IEEE Trans Cybern ; 53(11): 7187-7198, 2023 Nov.
Article in English | MEDLINE | ID: mdl-36063509

ABSTRACT

As the third-generation neural networks, spiking neural networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, deep spiking reinforcement learning (DSRL), that is, the reinforcement learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and the nondifferentiable property of the spiking function. To address these issues, we propose a deep spiking Q-network (DSQN) in this article. Specifically, we propose a directly trained DSRL architecture based on the leaky integrate-and-fire (LIF) neurons and deep Q-network (DQN). Then, we adapt a direct spiking learning algorithm for the DSQN. We further demonstrate the advantages of using LIF neurons in DSQN theoretically. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, generalization and energy efficiency. To the best of our knowledge, our work is the first one to achieve state-of-the-art performance on multiple Atari games with the directly trained SNN.
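
The leaky integrate-and-fire (LIF) neuron named in the abstract above can be sketched in discrete time as follows. This is a generic textbook-style illustration, not the DSQN implementation: the function `simulate_lif` and the `decay`, `threshold`, and `v_reset` values are assumptions. The binary spike output in the loop is the nondifferentiable step the abstract mentions.

```python
def simulate_lif(inputs, decay=0.9, threshold=1.0, v_reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron; returns a binary spike train."""
    v = v_reset
    spikes = []
    for current in inputs:
        v = decay * v + current  # leak the old potential, then integrate the input
        if v >= threshold:
            spikes.append(1)     # binary output: the neuron fires
            v = v_reset          # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


# A constant sub-threshold input accumulates until the neuron fires once.
spike_train = simulate_lif([0.5, 0.5, 0.5, 0.5])
```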

20.
Neural Netw ; 154: 543-559, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35995020

ABSTRACT

Event cameras sense changes in light intensity and record them as an asynchronous event stream. Efficiently encoding and learning spatiotemporal information of the event streams remain challenging. In this paper, we propose a novel event descriptor to encode the spatio-temporal features for event streams and a local-search based multi-spike learning algorithm for spiking neural networks to classify encoded features. The spatio-temporal event surface (STES) descriptor explicitly captures both spatial and temporal correlations among events, and thus can characterize spatiotemporal features more accurately than existing feature descriptors that focus only on temporal or spatial information. In classification with multi-spike learning, we introduce a local search and gradient clipping mechanism to ensure the efficiency and stability of learning, which avoids other multi-spike learning rules' time-consuming global search and the gradient explosion problem. Experimental results demonstrate the superior classification performance of our proposed model, especially for event streams with rich spatiotemporal dynamics.


Subjects
Models, Neurological; Neurons; Action Potentials; Algorithms; Neural Networks, Computer