1.
IEEE Trans Neural Netw Learn Syst; 28(10): 2222-2232, 2017 Oct.
Article in English | MEDLINE | ID: mdl-27411231

ABSTRACT

Several variants of the long short-term memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the functional ANalysis Of VAriance (fANOVA) framework. In total, we summarize the results of 5400 experimental runs (≈15 years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.
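
A minimal sketch to make the two critical components concrete: one step of a standard LSTM cell in NumPy, with the forget gate and the output activation called out in comments. Peephole connections, which the paper's vanilla variant includes, are omitted for brevity; all names and shapes here are illustrative assumptions, not taken from the paper's code.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the parameters of the input (i),
    forget (f), and output (o) gates and the cell candidate (g)."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # all four pre-activations at once
    i = sigmoid(z[0*n:1*n])         # input gate
    f = sigmoid(z[1*n:2*n])         # forget gate: controls how much memory is kept
    o = sigmoid(z[2*n:3*n])         # output gate
    g = np.tanh(z[3*n:4*n])         # cell candidate
    c = f * c_prev + i * g          # new cell state
    h = o * np.tanh(c)              # output activation (tanh) applied here
    return h, c

# Toy usage: 4 inputs, 3 hidden units.
rng = np.random.default_rng(0)
nx, nh = 4, 3
W = 0.1 * rng.standard_normal((4 * nh, nx))
U = 0.1 * rng.standard_normal((4 * nh, nh))
b = np.zeros(4 * nh)
h, c = np.zeros(nh), np.zeros(nh)
h, c = lstm_step(rng.standard_normal(nx), h, c, W, U, b)

Dropping the f gate or the final tanh corresponds to two of the eight variants the study compares (no forget gate and no output activation function).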

2.
Neural Netw; 23(4): 568-582, 2010 May.
Article in English | MEDLINE | ID: mdl-20227243

ABSTRACT

Optimizing the topology, weights, and neuron transfer functions of a neural network for a given data set and problem is not an easy task. In this article, we focus primarily on building an optimal feed-forward neural network classifier for i.i.d. data sets. We apply meta-learning principles to the optimization of neural network structure and function. We show that diversity promotion, ensembling, self-organization, and induction are beneficial for the problem. We combine several different neuron types, trained by various optimization algorithms, to build a supervised feed-forward neural network called Group of Adaptive Models Evolution (GAME). The approach was tested on a large number of benchmark data sets. The experiments show that combining different optimization algorithms within the network is the best choice when performance is averaged over several real-world problems.


Subject(s)
Learning; Nerve Net; Neural Networks, Computer; Algorithms; Artificial Intelligence; Computer Simulation; Models, Biological; Neurons; Pattern Recognition, Automated
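
As a loose illustration of the GAME idea of combining heterogeneous models trained by different optimization algorithms, the sketch below averages three structurally different scikit-learn classifiers. This is an analogy under stated assumptions, not the GAME algorithm itself, which additionally grows its topology inductively; the dataset and model choices here are arbitrary.

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic i.i.d. classification data, matching the article's setting.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Three different model families, each fit by a different learning algorithm.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=7)),
    ],
    voting="soft",  # average predicted class probabilities
)
print(cross_val_score(ensemble, X, y, cv=5).mean())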