Search | VHL Regional Portal

1.

Efficient SNN multi-cores MAC array acceleration on SpiNNaker 2.

Huang, Jiaxin; Kelber, Florian; Vogginger, Bernhard; Liu, Chen; Kreutz, Felix; Gerhards, Pascal; Scholz, Daniel; Knobloch, Klaus; Mayr, Christian G.

Front Neurosci ; 17: 1223262, 2023.

Article in English | MEDLINE | ID: mdl-37609449

ABSTRACT

The potential low-energy feature of the spiking neural network (SNN) engages the attention of the AI community. Only CPU-involved SNN processing inevitably results in an inherently long temporal span in the cases of large models and massive datasets. This study introduces the MAC array, a parallel architecture on each processing element (PE) of SpiNNaker 2, into the computational process of SNN inference. Based on the work of single-core optimization algorithms, we investigate the parallel acceleration algorithms for collaborating with multi-core MAC arrays. The proposed Echelon Reorder model information densification algorithm, along with the adapted multi-core two-stage splitting and authorization deployment strategies, achieves efficient spatio-temporal load balancing and optimization performance. We evaluate the performance by benchmarking a wide range of constructed SNN models to research on the influence degree of different factors. We also benchmark with two actual SNN models (the gesture recognition model of the real-world application and balanced random cortex-like network from neuroscience) on the neuromorphic multi-core hardware SpiNNaker 2. The echelon optimization algorithm with mixed processors realizes 74.28% and 85.78% memory footprint of the original MAC calculation on these two models, respectively. The execution time of echelon algorithms using only MAC or mixed processors accounts for ≤ 24.56% of the serial ARM baseline. Accelerating SNN inference with algorithms in this study is essentially the general sparse matrix-matrix multiplication (SpGEMM) problem. This article explicitly expands the application field of the SpGEMM issue to SNN, developing novel SpGEMM optimization algorithms fitting the SNN feature and MAC array.

2.

E-prop on SpiNNaker 2: Exploring online learning in spiking RNNs on neuromorphic hardware.

Rostami, Amirhossein; Vogginger, Bernhard; Yan, Yexin; Mayr, Christian G.

Front Neurosci ; 16: 1018006, 2022.

Article in English | MEDLINE | ID: mdl-36518534

ABSTRACT

Introduction: In recent years, the application of deep learning models at the edge has gained attention. Typically, artificial neural networks (ANNs) are trained on graphics processing units (GPUs) and optimized for efficient execution on edge devices. Training ANNs directly at the edge is the next step with many applications such as the adaptation of models to specific situations like changes in environmental settings or optimization for individuals, e.g., optimization for speakers for speech processing. Also, local training can preserve privacy. Over the last few years, many algorithms have been developed to reduce memory footprint and computation. Methods: A specific challenge to train recurrent neural networks (RNNs) for processing sequential data is the need for the Back Propagation Through Time (BPTT) algorithm to store the network state of all time steps. This limitation is resolved by the biologically-inspired E-prop approach for training Spiking Recurrent Neural Networks (SRNNs). We implement the E-prop algorithm on a prototype of the SpiNNaker 2 neuromorphic system. A parallelization strategy is developed to split and train networks on the ARM cores of SpiNNaker 2 to make efficient use of both memory and compute resources. We trained an SRNN from scratch on SpiNNaker 2 in real-time on the Google Speech Command dataset for keyword spotting. Result: We achieved an accuracy of 91.12% while requiring only 680 KB of memory for training the network with 25 K weights. Compared to other spiking neural networks with equal or better accuracy, our work is significantly more memory-efficient. Discussion: In addition, we performed a memory and time profiling of the E-prop algorithm. This is used on the one hand to discuss whether E-prop or BPTT is better suited for training a model at the edge and on the other hand to explore architecture modifications to SpiNNaker 2 to speed up online learning. Finally, energy estimations predict that the SRNN can be trained on SpiNNaker2 with 12 times less energy than using a NVIDIA V100 GPU.

3.

A Biohybrid Setup for Coupling Biological and Neuromorphic Neural Networks.

Keren, Hanna; Partzsch, Johannes; Marom, Shimon; Mayr, Christian G.

Front Neurosci ; 13: 432, 2019.

Article in English | MEDLINE | ID: mdl-31133779

ABSTRACT

Developing technologies for coupling neural activity and artificial neural components, is key for advancing neural interfaces and neuroprosthetics. We present a biohybrid experimental setting, where the activity of a biological neural network is coupled to a biomimetic hardware network. The implementation of the hardware network (denoted NeuroSoC) exhibits complex dynamics with a multiplicity of time-scales, emulating 2880 neurons and 12.7 M synapses, designed on a VLSI chip. This network is coupled to a neural network in vitro, where the activities of both the biological and the hardware networks can be recorded, processed, and integrated bidirectionally in real-time. This experimental setup enables an adjustable and well-monitored coupling, while providing access to key functional features of neural networks. We demonstrate the feasibility to functionally couple the two networks and to implement control circuits to modify the biohybrid activity. Overall, we provide an experimental model for neuromorphic-neural interfaces, hopefully to advance the capability to interface with neural activity, and with its irregularities in pathology.

4.

Memory-Efficient Deep Learning on a SpiNNaker 2 Prototype.

Liu, Chen; Bellec, Guillaume; Vogginger, Bernhard; Kappel, David; Partzsch, Johannes; Neumärker, Felix; Höppner, Sebastian; Maass, Wolfgang; Furber, Steve B; Legenstein, Robert; Mayr, Christian G.

Front Neurosci ; 12: 840, 2018.

Article in English | MEDLINE | ID: mdl-30505263

ABSTRACT

The memory requirement of deep learning algorithms is considered incompatible with the memory restriction of energy-efficient hardware. A low memory footprint can be achieved by pruning obsolete connections or reducing the precision of connection strengths after the network has been trained. Yet, these techniques are not applicable to the case when neural networks have to be trained directly on hardware due to the hard memory constraints. Deep Rewiring (DEEP R) is a training algorithm which continuously rewires the network while preserving very sparse connectivity all along the training procedure. We apply DEEP R to a deep neural network implementation on a prototype chip of the 2nd generation SpiNNaker system. The local memory of a single core on this chip is limited to 64 KB and a deep network architecture is trained entirely within this constraint without the use of external memory. Throughout training, the proportion of active connections is limited to 1.3%. On the handwritten digits dataset MNIST, this extremely sparse network achieves 96.6% classification accuracy at convergence. Utilizing the multi-processor feature of the SpiNNaker system, we found very good scaling in terms of computation time, per-core memory consumption, and energy constraints. When compared to a X86 CPU implementation, neural network training on the SpiNNaker 2 prototype improves power and energy consumption by two orders of magnitude.

5.

Editorial: Synaptic Plasticity for Neuromorphic Systems.

Mayr, Christian G; Sheik, Sadique; Bartolozzi, Chiara; Chicca, Elisabetta.

Front Neurosci ; 10: 214, 2016.

Article in English | MEDLINE | ID: mdl-27242418

6.

Implementation of a spike-based perceptron learning rule using TiO2-x memristors.

Mostafa, Hesham; Khiat, Ali; Serb, Alexander; Mayr, Christian G; Indiveri, Giacomo; Prodromakis, Themis.

Front Neurosci ; 9: 357, 2015.

Article in English | MEDLINE | ID: mdl-26483629

ABSTRACT

Synaptic plasticity plays a crucial role in allowing neural networks to learn and adapt to various input environments. Neuromorphic systems need to implement plastic synapses to obtain basic "cognitive" capabilities such as learning. One promising and scalable approach for implementing neuromorphic synapses is to use nano-scale memristors as synaptic elements. In this paper we propose a hybrid CMOS-memristor system comprising CMOS neurons interconnected through TiO2-x memristors, and spike-based learning circuits that modulate the conductance of the memristive synapse elements according to a spike-based Perceptron plasticity rule. We highlight a number of advantages for using this spike-based plasticity rule as compared to other forms of spike timing dependent plasticity (STDP) rules. We provide experimental proof-of-concept results with two silicon neurons connected through a memristive synapse that show how the CMOS plasticity circuits can induce stable changes in memristor conductances, giving rise to increased synaptic strength after a potentiation episode and to decreased strength after a depression episode.

7.

Single pairing spike-timing dependent plasticity in BiFeO3 memristors with a time window of 25 ms to 125 µs.

Du, Nan; Kiani, Mahdi; Mayr, Christian G; You, Tiangui; Bürger, Danilo; Skorupa, Ilona; Schmidt, Oliver G; Schmidt, Heidemarie.

Front Neurosci ; 9: 227, 2015.

Article in English | MEDLINE | ID: mdl-26175666

ABSTRACT

Memristive devices are popular among neuromorphic engineers for their ability to emulate forms of spike-driven synaptic plasticity by applying specific voltage and current waveforms at their two terminals. In this paper, we investigate spike-timing dependent plasticity (STDP) with a single pairing of one presynaptic voltage spike and one post-synaptic voltage spike in a BiFeO3 memristive device. In most memristive materials the learning window is primarily a function of the material characteristics and not of the applied waveform. In contrast, we show that the analog resistive switching of the developed artificial synapses allows to adjust the learning time constant of the STDP function from 25 ms to 125 µs via the duration of applied voltage spikes. Also, as the induced weight change may degrade, we investigate the remanence of the resistance change for several hours after analog resistive switching, thus emulating the processes expected in biological synapses. As the power consumption is a major constraint in neuromorphic circuits, we show methods to reduce the consumed energy per setting pulse to only 4.5 pJ in the developed artificial synapses.

8.

Plasticity in memristive devices for spiking neural networks.

Saïghi, Sylvain; Mayr, Christian G; Serrano-Gotarredona, Teresa; Schmidt, Heidemarie; Lecerf, Gwendal; Tomas, Jean; Grollier, Julie; Boyn, Sören; Vincent, Adrien F; Querlioz, Damien; La Barbera, Selina; Alibart, Fabien; Vuillaume, Dominique; Bichler, Olivier; Gamrat, Christian; Linares-Barranco, Bernabé.

Front Neurosci ; 9: 51, 2015.

Article in English | MEDLINE | ID: mdl-25784849

ABSTRACT

Memristive devices present a new device technology allowing for the realization of compact non-volatile memories. Some of them are already in the process of industrialization. Additionally, they exhibit complex multilevel and plastic behaviors, which make them good candidates for the implementation of artificial synapses in neuromorphic engineering. However, memristive effects rely on diverse physical mechanisms, and their plastic behaviors differ strongly from one technology to another. Here, we present measurements performed on different memristive devices and the opportunities that they provide. We show that they can be used to implement different learning rules whose properties emerge directly from device physics: real time or accelerated operation, deterministic or stochastic behavior, long term or short term plasticity. We then discuss how such devices might be integrated into a complete architecture. These results highlight that there is no unique way to exploit memristive devices in neuromorphic systems. Understanding and embracing device physics is the key for their optimal use.

9.

Switched-capacitor realization of presynaptic short-term-plasticity and stop-learning synapses in 28 nm CMOS.

Noack, Marko; Partzsch, Johannes; Mayr, Christian G; Hänzsche, Stefan; Scholze, Stefan; Höppner, Sebastian; Ellguth, Georg; Schüffny, Rene.

Front Neurosci ; 9: 10, 2015.

Article in English | MEDLINE | ID: mdl-25698914

ABSTRACT

Synaptic dynamics, such as long- and short-term plasticity, play an important role in the complexity and biological realism achievable when running neural networks on a neuromorphic IC. For example, they endow the IC with an ability to adapt and learn from its environment. In order to achieve the millisecond to second time constants required for these synaptic dynamics, analog subthreshold circuits are usually employed. However, due to process variation and leakage problems, it is almost impossible to port these types of circuits to modern sub-100nm technologies. In contrast, we present a neuromorphic system in a 28 nm CMOS process that employs switched capacitor (SC) circuits to implement 128 short term plasticity presynapses as well as 8192 stop-learning synapses. The neuromorphic system consumes an area of 0.36 mm(2) and runs at a power consumption of 1.9 mW. The circuit makes use of a technique for minimizing leakage effects allowing for real-time operation with time constants up to several seconds. Since we rely on SC techniques for all calculations, the system is composed of only generic mixed-signal building blocks. These generic building blocks make the system easy to port between technologies and the large digital circuit part inherent in an SC system benefits fully from technology scaling.

10.

Configurable analog-digital conversion using the neural engineering framework.

Mayr, Christian G; Partzsch, Johannes; Noack, Marko; Schüffny, Rene.

Front Neurosci ; 8: 201, 2014.

Article in English | MEDLINE | ID: mdl-25100933

ABSTRACT

Efficient Analog-Digital Converters (ADC) are one of the mainstays of mixed-signal integrated circuit design. Besides the conventional ADCs used in mainstream ICs, there have been various attempts in the past to utilize neuromorphic networks to accomplish an efficient crossing between analog and digital domains, i.e., to build neurally inspired ADCs. Generally, these have suffered from the same problems as conventional ADCs, that is they require high-precision, handcrafted analog circuits and are thus not technology portable. In this paper, we present an ADC based on the Neural Engineering Framework (NEF). It carries out a large fraction of the overall ADC process in the digital domain, i.e., it is easily portable across technologies. The analog-digital conversion takes full advantage of the high degree of parallelism inherent in neuromorphic networks, making for a very scalable ADC. In addition, it has a number of features not commonly found in conventional ADCs, such as a runtime reconfigurability of the ADC sampling rate, resolution and transfer characteristic.

11.

Rate and pulse based plasticity governed by local synaptic state variables.

Mayr, Christian G; Partzsch, Johannes.

Front Synaptic Neurosci ; 2: 33, 2010.

Article in English | MEDLINE | ID: mdl-21423519

ABSTRACT

Classically, action-potential-based learning paradigms such as the Bienenstock-Cooper-Munroe (BCM) rule for pulse rates or spike timing-dependent plasticity for pulse pairings have been experimentally demonstrated to evoke long-lasting synaptic weight changes (i.e., plasticity). However, several recent experiments have shown that plasticity also depends on the local dynamics at the synapse, such as membrane voltage, Calcium time course and level, or dendritic spikes. In this paper, we introduce a formulation of the BCM rule which is based on the instantaneous postsynaptic membrane potential as well as the transmission profile of the presynaptic spike. While this rule incorporates only simple local voltage- and current dynamics and is thus neither directly rate nor timing based, it can replicate a range of experiments, such as various rate and spike pairing protocols, combinations of the two, as well as voltage-dependent plasticity. A detailed comparison of current plasticity models with respect to this range of experiments also demonstrates the efficacy of the new plasticity rule. All experiments can be replicated with a limited set of parameters, avoiding the overfitting problem of more involved plasticity rules.

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL