Search | VHL Regional Portal

1.

Spike-based dynamic computing with asynchronous sensing-computing neuromorphic chip.

Yao, Man; Richter, Ole; Zhao, Guangshe; Qiao, Ning; Xing, Yannan; Wang, Dingheng; Hu, Tianxiang; Fang, Wei; Demirci, Tugba; De Marchi, Michele; Deng, Lei; Yan, Tianyi; Nielsen, Carsten; Sheik, Sadique; Wu, Chenxi; Tian, Yonghong; Xu, Bo; Li, Guoqi.

Nat Commun ; 15(1): 4464, 2024 May 25.

Article in English | MEDLINE | ID: mdl-38796464

ABSTRACT

By mimicking the neurons and synapses of the human brain and employing spiking neural networks on neuromorphic chips, neuromorphic computing offers a promising energy-efficient machine intelligence. How to borrow high-level brain dynamic mechanisms to help neuromorphic computing achieve energy advantages is a fundamental issue. This work presents an application-oriented algorithm-software-hardware co-designed neuromorphic system for this issue. First, we design and fabricate an asynchronous chip called "Speck", a sensing-computing neuromorphic system on chip. With the low processor resting power of 0.42mW, Speck can satisfy the hardware requirements of dynamic computing: no-input consumes no energy. Second, we uncover the "dynamic imbalance" in spiking neural networks and develop an attention-based framework for achieving the algorithmic requirements of dynamic computing: varied inputs consume energy with large variance. Together, we demonstrate a neuromorphic system with real-time power as low as 0.70mW. This work exhibits the promising potentials of neuromorphic computing with its asynchronous event-driven, sparse, and dynamic nature.

Subject(s)

Algorithms , Neural Networks, Computer , Neurons , Humans , Neurons/physiology , Models, Neurological , Action Potentials/physiology , Synapses/physiology , Brain/physiology , Software

2.

Sparser spiking activity can be better: Feature Refine-and-Mask spiking neural network for event-based visual recognition.

Yao, Man; Zhang, Hengyu; Zhao, Guangshe; Zhang, Xiyu; Wang, Dingheng; Cao, Gang; Li, Guoqi.

Neural Netw ; 166: 410-423, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37549609

ABSTRACT

Event-based vision, a new visual paradigm with bio-inspired dynamic perception and µs-level temporal resolution, has prominent advantages in many specific visual scenarios and has gained much research interest. The spiking neural network (SNN) is naturally suitable for dealing with event streams due to its temporal information processing capability and event-driven nature. However, existing SNN works neglect the fact that the input event streams are spatially sparse and temporally non-uniform, and just treat these variant inputs equally. This situation interferes with the effectiveness and efficiency of existing SNNs. In this paper, we propose the feature Refine-and-Mask SNN (RM-SNN), which has the ability of self-adaption to regulate the spiking response in a data-dependent way. We use the Refine-and-Mask (RM) module to refine all features and mask the unimportant features to optimize the membrane potential of spiking neurons, which in turn drops the spiking activity. Inspired by the fact that not all events in spatio-temporal streams are task-relevant, we execute the RM module in both temporal and channel dimensions. Extensive experiments on seven event-based benchmarks, DVS128 Gesture, DVS128 Gait, CIFAR10-DVS, N-Caltech101, DailyAction-DVS, UCF101-DVS, and HMDB51-DVS, demonstrate that under the multi-scale constraints of input time window, RM-SNN can significantly reduce the network average spiking activity rate while improving the task performance. In addition, by visualizing spiking responses, we analyze why sparser spiking activity can be better.

Subject(s)

Neural Networks, Computer , Time Perception , Action Potentials/physiology , Recognition, Psychology , Neurons/physiology

3.

Attention Spiking Neural Networks.

Yao, Man; Zhao, Guangshe; Zhang, Hengyu; Hu, Yifan; Deng, Lei; Tian, Yonghong; Xu, Bo; Li, Guoqi.

IEEE Trans Pattern Anal Mach Intell ; 45(8): 9393-9410, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37022261

ABSTRACT

Brain-inspired spiking neural networks (SNNs) are becoming a promising energy-efficient alternative to traditional artificial neural networks (ANNs). However, the performance gap between SNNs and ANNs has been a significant hindrance to deploying SNNs ubiquitously. To leverage the full potential of SNNs, in this paper we study attention mechanisms, which can help humans focus on important information. We present our idea of attention in SNNs with a multi-dimensional attention module, which infers attention weights along the temporal, channel, and spatial dimensions separately or simultaneously. Based on existing neuroscience theories, we exploit the attention weights to optimize membrane potentials, which in turn regulate the spiking response. Extensive experimental results on event-based action recognition and image classification datasets demonstrate that attention helps vanilla SNNs achieve sparser spike firing, better performance, and energy efficiency concurrently. In particular, we achieve top-1 accuracy of 75.92% and 77.08% on ImageNet-1K with single/4-step Res-SNN-104, which are state-of-the-art results in SNNs. Compared with the counterpart Res-ANN-104, the performance gap becomes -0.95/+0.21 percent and the energy efficiency is 31.8×/7.4×. To analyze the effectiveness of attention SNNs, we theoretically prove that the spiking degradation or the gradient vanishing, which usually holds in general SNNs, can be resolved by introducing the block dynamical isometry theory. We also analyze the efficiency of attention SNNs based on our proposed spiking response visualization method. Our work lights up SNN's potential as a general backbone to support various applications in the field of SNN research, with a great balance between effectiveness and energy efficiency.

Subject(s)

Algorithms , Neurons , Humans , Neural Networks, Computer , Brain

4.

Kronecker CP Decomposition With Fast Multiplication for Compressing RNNs.

Wang, Dingheng; Wu, Bijiao; Zhao, Guangshe; Yao, Man; Chen, Hengnu; Deng, Lei; Yan, Tianyi; Li, Guoqi.

IEEE Trans Neural Netw Learn Syst ; 34(5): 2205-2219, 2023 May.

Article in English | MEDLINE | ID: mdl-34534089

ABSTRACT

Recurrent neural networks (RNNs) are powerful in tasks oriented to sequential data, such as natural language processing and video recognition. However, because modern RNNs have complex topologies and expensive space/computation complexity, compressing them has become a hot and promising topic in recent years. Among plenty of compression methods, tensor decomposition, e.g., tensor train (TT), block term (BT), tensor ring (TR), and hierarchical Tucker (HT), appears to be the most amazing approach because a very high compression ratio might be obtained. Nevertheless, none of these tensor decomposition formats can provide both space and computation efficiency. In this article, we consider compressing RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition, which is derived from Kronecker tensor (KT) decomposition, by proposing two fast algorithms of multiplication between the input and the tensor-decomposed weight. According to our experiments based on UCF11, Youtube Celebrities Face, UCF50, TIMIT, TED-LIUM, and Spiking Heidelberg digits datasets, it can be verified that the proposed KCP-RNNs have accuracy comparable with those in other tensor-decomposed formats, and even a 278,219× compression ratio could be obtained by the low-rank KCP. More importantly, KCP-RNNs are efficient in both space and computation complexity compared with other tensor-decomposed ones. Besides, we find KCP has the best potential of parallel computing to accelerate the calculations in neural networks.

5.

Modeling learnable electrical synapse for high precision spatio-temporal recognition.

Wu, Zhenzhi; Zhang, Zhihong; Gao, Huanhuan; Qin, Jun; Zhao, Rongzhen; Zhao, Guangshe; Li, Guoqi.

Neural Netw ; 149: 184-194, 2022 May.

Article in English | MEDLINE | ID: mdl-35248808

ABSTRACT

Bio-inspired recipes are being introduced to artificial neural networks for the efficient processing of spatio-temporal tasks. Among them, the Leaky Integrate and Fire (LIF) model is the most remarkable one thanks to its temporal processing capability, lightweight model structure, and well-investigated direct training methods. However, most learnable LIF networks generally take neurons as independent individuals that communicate via chemical synapses, leaving electrical synapses behind. By contrast, it has been well established in biological neural networks that the inter-neuron electrical synapse has a great effect on the coordination and synchronization of generating action potentials. In this work, we model such electrical synapses in artificial LIF neurons, where membrane potentials propagate to neighbor neurons via convolution operations, and the refined neural model ECLIF is proposed. We then build deep networks using ECLIF and train them using a back-propagation-through-time algorithm. We found that the proposed network has great accuracy improvement over traditional LIF on five datasets and achieves high accuracy on them. In conclusion, it reveals that the introduction of the electrical synapse is an important factor for achieving high accuracy on realistic spatio-temporal tasks.
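
As a rough illustration of the idea in this abstract, the sketch below shows one discrete-time LIF update for a row of neurons, with an optional step in which membrane potentials are mixed with those of neighboring neurons by a 1-D convolution. It is not the authors' ECLIF model: the function name `lif_step`, the coupling `kernel`, the `decay` factor, the threshold, and the hard reset are all assumptions chosen for illustration.

```python
import numpy as np

def lif_step(v, x, kernel=None, decay=0.5, v_th=1.0, v_reset=0.0):
    """One discrete-time LIF update for a 1-D row of neurons.

    v      : membrane potentials carried over from the previous step
    x      : input current for this step
    kernel : hypothetical coupling kernel; None disables the coupling
    """
    if kernel is not None:
        # Illustrative electrical coupling (an assumption, not ECLIF itself):
        # each potential is mixed with its neighbours' via a 1-D convolution.
        v = np.convolve(v, kernel, mode="same")
    v = decay * v + x                        # leaky integration of the input
    spikes = (v >= v_th).astype(v.dtype)     # fire where the threshold is reached
    v = np.where(spikes > 0, v_reset, v)     # hard reset of neurons that fired
    return v, spikes
```

Calling `lif_step` without a kernel gives an ordinary LIF layer in which neurons are independent; passing a small symmetric kernel such as `[0.25, 0.5, 0.25]` lets sub-threshold potential leak to adjacent neurons before integration.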

Subject(s)

Electrical Synapses , Models, Neurological , Action Potentials/physiology , Humans , Neural Networks, Computer , Neurons/physiology , Synapses/physiology

6.

Nonlinear tensor train format for deep neural network compression.

Wang, Dingheng; Zhao, Guangshe; Chen, Hengnu; Liu, Zhexian; Deng, Lei; Li, Guoqi.

Neural Netw ; 144: 320-333, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34547670

ABSTRACT

Deep neural network (DNN) compression has become a hot topic in deep learning research, since modern DNNs have become too large to implement on practical resource-constrained platforms such as embedded devices. Among various compression methods, tensor decomposition appears to be a relatively simple and efficient strategy owing to its solid mathematical foundations and regular data structure. Generally, tensorizing neural weights into higher-order tensors for better decomposition, and directly mapping efficient tensor structure to neural architecture with nonlinear activation functions, are the two most common ways. However, considerable accuracy loss is still a fly in the ointment for the tensorizing way, especially for convolutional neural networks (CNNs), while the number of studies in the mapping way is comparatively limited and the corresponding compression ratio appears not to be considerable. Therefore, in this work, by researching multiple types of tensor decompositions, we realize that tensor train (TT), which has specific and efficient sequenced contractions, has the potential to take into account both the tensorizing and mapping ways. Then we propose a novel nonlinear tensor train (NTT) format, which contains extra nonlinear activation functions embedded in sequenced contractions and convolutions on top of the normal TT decomposition and the proposed TT format connected by convolutions, to compensate for the accuracy loss that normal TT cannot give. Beyond just shrinking the space complexity of original weight matrices and convolutional kernels, we prove that NTT can afford an efficient inference time as well. Extensive experiments and discussions demonstrate that the compressed DNNs in our NTT format can almost maintain the accuracy at least on MNIST, UCF11 and CIFAR-10 datasets, and the accuracy loss caused by normal TT could be compensated significantly on large-scale datasets such as ImageNet.

Subject(s)

Data Compression , Neural Networks, Computer , Algorithms , Physical Phenomena

7.

QTTNet: Quantized tensor train neural networks for 3D object and video recognition.

Lee, Donghyun; Wang, Dingheng; Yang, Yukuan; Deng, Lei; Zhao, Guangshe; Li, Guoqi.

Neural Netw ; 141: 420-432, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34146969

ABSTRACT

Relying on the rapidly increasing capacity of computing clusters and hardware, convolutional neural networks (CNNs) have been successfully applied in various fields and achieved state-of-the-art results. Despite these exciting developments, a huge memory cost is still involved in training and inferring a large-scale CNN model, which makes it hard to use widely in resource-limited portable devices. To address this problem, we establish a training framework for three-dimensional convolutional neural networks (3DCNNs) named QTTNet that combines tensor train (TT) decomposition and data quantization for further shrinking the model size and decreasing the memory and time cost. Through this framework, we can fully explore the superiority of TT in reducing the number of trainable parameters and the advantage of quantization in decreasing the bit-width of data, particularly compressing 3DCNN models greatly with little accuracy degradation. In addition, due to the low-bit quantization of all parameters during the inference process, including TT-cores, activations, and batch normalizations, the proposed method naturally has an advantage in memory and time cost. Experimental results of compressing 3DCNNs for 3D object and video recognition on ModelNet40, UCF11, and UCF50 datasets verify the effectiveness of the proposed method. The best compression ratio we have obtained is up to nearly 180× with competitive performance compared with other state-of-the-art studies. Moreover, the total bytes of our QTTNet models on ModelNet40 and UCF11 datasets can be 1000× lower than some typical practices such as MVCNN.

Subject(s)

Neural Networks, Computer , Data Compression , Imaging, Three-Dimensional

8.

Hybrid tensor decomposition in neural network compression.

Wu, Bijiao; Wang, Dingheng; Zhao, Guangshe; Deng, Lei; Li, Guoqi.

Neural Netw ; 132: 309-320, 2020 Dec.

Article in English | MEDLINE | ID: mdl-32977276

ABSTRACT

Deep neural networks (DNNs) have enabled impressive breakthroughs in various artificial intelligence (AI) applications recently due to their capability of learning high-level features from big data. However, the demand of DNNs for computational resources, especially storage, is growing because increasingly large models are required for more and more complicated applications. To address this problem, several tensor decomposition methods including tensor-train (TT) and tensor-ring (TR) have been applied to compress DNNs and have shown considerable compression effectiveness. In this work, we introduce the hierarchical Tucker (HT), a classical but rarely used tensor decomposition method, to investigate its capability in neural network compression. We convert the weight matrices and convolutional kernels to both HT and TT formats for comparative study, since the latter is the most widely used decomposition method and the variant of HT. We further theoretically and experimentally discover that the HT format has better performance on compressing weight matrices, while the TT format is more suited for compressing convolutional kernels. Based on this phenomenon we propose a strategy of hybrid tensor decomposition by combining TT and HT together to compress convolutional and fully connected parts separately and attain better accuracy than only using the TT or HT format on convolutional neural networks (CNNs). Our work illuminates the prospects of hybrid tensor decomposition for neural network compression.

Subject(s)

Data Compression/methods , Deep Learning , Neural Networks, Computer , Algorithms , Artificial Intelligence

9.

Compressing 3DCNNs based on tensor train decomposition.

Wang, Dingheng; Zhao, Guangshe; Li, Guoqi; Deng, Lei; Wu, Yang.

Neural Netw ; 131: 215-230, 2020 Nov.

Article in English | MEDLINE | ID: mdl-32805632

ABSTRACT

Three-dimensional convolutional neural networks (3DCNNs) have been applied in many tasks, e.g., video and 3D point cloud recognition. However, due to the higher dimension of convolutional kernels, the space complexity of 3DCNNs is generally larger than that of traditional two-dimensional convolutional neural networks (2DCNNs). To miniaturize 3DCNNs for the deployment in confining environments such as embedded devices, neural network compression is a promising approach. In this work, we adopt the tensor train (TT) decomposition, a straightforward and simple in situ training compression method, to shrink the 3DCNN models. Through proposing tensorizing 3D convolutional kernels in TT format, we investigate how to select appropriate TT ranks for achieving higher compression ratio. We have also discussed the redundancy of 3D convolutional kernels for compression, core significance and future directions of this work, as well as the theoretical computation complexity versus practical executing time of convolution in TT. In the light of multiple contrast experiments based on VIVA challenge, UCF11, UCF101, and ModelNet40 datasets, we conclude that TT decomposition can compress 3DCNNs by around one hundred times without significant accuracy loss, which will enable its applications in extensive real world scenarios.
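
To make the link between TT ranks and compression ratio concrete, here is a minimal sketch that counts the parameters of a weight matrix stored in the standard TT-matrix format, where core k has shape (r_k, m_k, n_k, r_{k+1}). It covers a plain weight matrix only, not the 3D convolutional kernels tensorized in the paper, and the function names and the example factorization are illustrative assumptions.

```python
from math import prod

def tt_matrix_params(in_modes, out_modes, ranks):
    """Parameter count of a weight matrix held in TT-matrix format.

    in_modes, out_modes : factorizations of the input and output sizes
    ranks               : TT ranks r_0 .. r_d, with r_0 = r_d = 1
    """
    assert len(in_modes) == len(out_modes) == len(ranks) - 1
    assert ranks[0] == ranks[-1] == 1
    # Core k stores r_k * m_k * n_k * r_{k+1} values.
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )

def compression_ratio(in_modes, out_modes, ranks):
    """Dense parameter count divided by the TT parameter count."""
    dense = prod(in_modes) * prod(out_modes)
    return dense / tt_matrix_params(in_modes, out_modes, ranks)
```

For example, `compression_ratio((4, 4, 4), (4, 4, 4), (1, 2, 2, 1))` compares a dense 64 × 64 matrix (4096 values) with three small cores. Raising the inner ranks increases the core sizes and lowers the ratio, which is why rank selection drives the achievable compression.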

Subject(s)

Data Compression/methods , Neural Networks, Computer

10.

Rethinking the performance comparison between SNNs and ANNs.

Deng, Lei; Wu, Yujie; Hu, Xing; Liang, Ling; Ding, Yufei; Li, Guoqi; Zhao, Guangshe; Li, Peng; Xie, Yuan.

Neural Netw ; 121: 294-307, 2020 Jan.

Article in English | MEDLINE | ID: mdl-31586857

ABSTRACT

Artificial neural networks (ANNs), a popular path towards artificial intelligence, have experienced remarkable success via mature models, various benchmarks, open-source datasets, and powerful computing platforms. Spiking neural networks (SNNs), a category of promising models to mimic the neuronal dynamics of the brain, have gained much attention for brain-inspired computing and been widely deployed on neuromorphic devices. However, for a long time, there have been ongoing debates and skepticism about the value of SNNs in practical applications. Apart from the low-power benefit of spike-driven processing, SNNs usually perform worse than ANNs, especially in terms of application accuracy. Recently, researchers have attempted to address this issue by borrowing learning methodologies from ANNs, such as backpropagation, to train high-accuracy SNN models. The rapid progress in this domain continuously produces amazing results with ever-increasing network size, whose growing path seems similar to the development of deep learning. Although these ways endow SNNs with the capability to approach the accuracy of ANNs, the natural superiorities of SNNs and the way to outperform ANNs are potentially lost due to the use of ANN-oriented workloads and simplistic evaluation metrics. In this paper, we take the visual recognition task as a case study to answer the questions of "what workloads are ideal for SNNs and how to evaluate SNNs makes sense". We design a series of contrast tests using different types of datasets (ANN-oriented and SNN-oriented), diverse processing models, signal conversion methods, and learning algorithms. We propose comprehensive metrics on the application accuracy and the cost of memory & compute to evaluate these models, and conduct extensive experiments. We provide evidence that on ANN-oriented workloads, SNNs fail to beat their ANN counterparts, while on SNN-oriented workloads, SNNs can fully perform better. We further demonstrate that in SNNs there exists a trade-off between the application accuracy and the execution cost, which is affected by the simulation time window and firing threshold. Based on these abundant analyses, we recommend the most suitable model for each scenario. To the best of our knowledge, this is the first work using systematic comparisons to explicitly reveal that the straightforward workload porting from ANNs to SNNs is unwise, although many works are doing so, and that a comprehensive evaluation indeed matters. Finally, we highlight the urgent need to build a benchmarking framework for SNNs with broader tasks, datasets, and metrics.

Subject(s)

Action Potentials/physiology , Artificial Intelligence , Neural Networks, Computer , Pattern Recognition, Automated/methods , Algorithms , Brain/physiology , Humans , Memory/physiology , Neurons/physiology

11.

Seeker-Azimuth Determination with Gyro Rotor and Optoelectronic Sensors.

Bai, Jian-Ming; Zhao, Guangshe; Rong, Hai-Jun; Wang, Xianhua.

Sensors (Basel) ; 18(4), 2018 Apr 19.

Article in English | MEDLINE | ID: mdl-29671757

ABSTRACT

This paper presents an approach to seeker-azimuth determination using the gyro rotor and optoelectronic sensors. In the proposed method, the gyro rotor is designed with a set of black and white right spherical triangle patterns on its surface. Two pairs of optoelectronic sensors are located symmetrically around the gyro rotor. When there is an azimuth, the stripe width covering the black and white patterns changes. The optoelectronic sensors then capture the reflected optical signals from the different black and white pattern stripes on the gyro rotor and produce the duty ratio signal. The functional relationship between the measured duty ratio and the azimuth information is numerically derived, and, based on this relationship, the azimuth is determined from the measured duty ratio. Experimental results show that the proposed approach produces a large azimuth range and high measurement accuracy with the linearity error of less than 0.005.

12.

Entropy Based Modelling for Estimating Demographic Trends.

Li, Guoqi; Zhao, Daxuan; Xu, Yi; Kuo, Shyh-Hao; Xu, Hai-Yan; Hu, Nan; Zhao, Guangshe; Monterola, Christopher.

PLoS One ; 10(9): e0137324, 2015.

Article in English | MEDLINE | ID: mdl-26382594

ABSTRACT

In this paper, an entropy-based method is proposed to forecast the demographic changes of countries. We formulate the estimation of future demographic profiles as a constrained optimization problem, anchored on the empirically validated assumption that the entropy of age distribution is increasing in time. The procedure of the proposed method involves three stages, namely: 1) Prediction of the age distribution of a country's population based on an "age-structured population model"; 2) Estimation of the age distribution of each individual household size with an entropy-based formulation based on an "individual household size model"; and 3) Estimation of the number of each household size based on a "total household size model". The last stage is achieved by projecting the age distribution of the country's population (obtained in stage 1) onto the age distributions of individual household sizes (obtained in stage 2). The effectiveness of the proposed method is demonstrated using real-world data, and it is general and versatile enough to be extended to other time-dependent demographic variables.
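
The quantity this abstract relies on, the entropy of an age distribution, can be sketched as the Shannon entropy of population counts per age bin. This is an illustrative assumption: the abstract does not give the paper's exact entropy definition, and the function name and the natural-log base are choices made here.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in nats) of a distribution given as bin counts.

    counts : population count per age bin (hypothetical input format)
    """
    total = sum(counts)
    # Empty bins contribute nothing, so they are skipped to avoid log(0).
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)
```

A population concentrated in a single age bin has entropy 0, while one spread evenly over n bins reaches the maximum, log(n). An age distribution that flattens over time therefore shows increasing entropy under this definition.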

Subject(s)

Entropy , Family Characteristics , Population Dynamics/trends , Algorithms , Computer Simulation , Humans , Models, Theoretical , United States
