Results 1 - 4 of 4
1.
Neural Netw; 64: 4-11, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25292461

ABSTRACT

Unsupervised learning of feature hierarchies is often a good strategy to initialize deep architectures for supervised learning. Most existing deep learning methods build these feature hierarchies layer by layer in a greedy fashion, using either auto-encoders or restricted Boltzmann machines. Both yield encoders that compute linear projections of the input followed by a smooth thresholding function. In this work, we demonstrate that such encoders fail to find stable features when the required computation is in the exclusive-or class. To overcome this limitation, we propose a two-layer encoder which is less restricted in the type of features it can learn. The proposed encoder is regularized by an extension of previous work on contractive regularization. This two-layer contractive encoder potentially poses a more difficult optimization problem, and we further propose to linearly transform the hidden neurons of the encoder to make learning easier. We demonstrate the advantages of the two-layer encoders qualitatively on artificially constructed datasets as well as on commonly used benchmark datasets. We also conduct experiments on a semi-supervised learning task and show the benefits of the proposed two-layer encoders trained with the linear transformation of perceptrons.
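
[Editor's note] The core idea, penalizing the Jacobian of a two-layer encoding so that XOR-like features can still be learned stably, translates directly into code. Below is a minimal sketch, assuming a PyTorch setting; the layer sizes, the weight `lam`, and the plain-MSE reconstruction term are illustrative assumptions, and the paper's linear transformation of the hidden neurons is omitted here.

```python
# A minimal sketch (not the authors' code) of a two-layer encoder with a
# contractive penalty: the squared Frobenius norm of the Jacobian of the
# code with respect to the input.
import torch
import torch.nn as nn

class TwoLayerContractiveAE(nn.Module):
    def __init__(self, n_in=784, n_mid=256, n_code=64):
        super().__init__()
        # Two nonlinear encoding layers instead of the usual single one, so
        # the encoder can represent XOR-like (non-linearly-separable) features.
        self.enc1 = nn.Linear(n_in, n_mid)
        self.enc2 = nn.Linear(n_mid, n_code)
        self.dec = nn.Linear(n_code, n_in)

    def encode(self, x):
        return torch.sigmoid(self.enc2(torch.sigmoid(self.enc1(x))))

    def forward(self, x):
        return self.dec(self.encode(x))

def contractive_loss(model, x, lam=1e-3):
    x = x.clone().requires_grad_(True)
    h = model.encode(x)
    recon = model.dec(h)
    mse = ((recon - x) ** 2).sum(dim=1).mean()
    # Squared Frobenius norm of dh/dx, accumulated one code unit at a time.
    jac_pen = 0.0
    for j in range(h.shape[1]):
        g, = torch.autograd.grad(h[:, j].sum(), x, create_graph=True)
        jac_pen = jac_pen + (g ** 2).sum(dim=1).mean()
    return mse + lam * jac_pen
```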


Subject(s)
Algorithms , Artificial Intelligence , Neural Networks, Computer
2.
Neural Netw; 64: 12-8, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25318376

ABSTRACT

Restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs) are important models in deep learning, but it is often difficult to measure their performance in general, or measure the importance of individual hidden units in specific. We propose to use mutual information to measure the usefulness of individual hidden units in Boltzmann machines. The measure is fast to compute, and serves as an upper bound for the information the neuron can pass on, enabling detection of a particular kind of poor training results. We confirm experimentally that the proposed measure indicates how much the performance of the model drops when some of the units of an RBM are pruned away. We demonstrate the usefulness of the measure for early detection of poor training in DBMs.
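
[Editor's note] For binary stochastic units the quantity I(h_j; v) = H(h_j) - H(h_j | v) is exactly the binary entropy of the unit's mean activation minus its mean binary entropy over the data, which is indeed fast to compute. The sketch below implements that identity; whether it matches the paper's exact estimator is an assumption.

```python
# A minimal sketch of a mutual-information score for binary hidden units,
# computed from each unit's conditional activation probabilities on a data
# sample: I(h_j; v) = H(h_j) - H(h_j | v).
import numpy as np

def binary_entropy(p, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def hidden_unit_mi(probs):
    """probs: (n_samples, n_hidden) array of P(h_j = 1 | v) over the data.

    Returns a per-unit score in bits, upper-bounded by 1 bit for a binary
    unit; a score near 0 flags a unit that passes on almost no information.
    """
    h_marginal = binary_entropy(probs.mean(axis=0))      # H(h_j)
    h_conditional = binary_entropy(probs).mean(axis=0)   # H(h_j | v)
    return h_marginal - h_conditional
```

For an RBM, `probs` is simply `sigmoid(v @ W + c)` over a validation batch; units scoring near zero are natural pruning candidates, in line with the pruning experiment described above.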


Subject(s)
Algorithms , Neural Networks, Computer
3.
Neural Comput; 25(3): 805-31, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23148412

ABSTRACT

Restricted Boltzmann machines (RBMs) are often used as building blocks in the greedy learning of deep networks. However, training this simple model can be laborious. Traditional learning algorithms often converge only with the right choice of metaparameters that specify, for example, the learning-rate schedule and the scale of the initial weights. They are also sensitive to the specific data representation: an equivalent RBM can be obtained by flipping some bits and changing the weights and biases accordingly, but traditional learning rules are not invariant to such transformations. Without careful tuning of these training settings, traditional algorithms can easily get stuck or even diverge. In this letter, we present an enhanced gradient that is derived to be invariant to bit-flipping transformations. We show experimentally that the enhanced gradient yields more stable training of RBMs, both with a fixed learning rate and with an adaptive one.
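
[Editor's note] The published enhanced gradient replaces the raw correlations of the standard RBM gradient with covariances, dW_ij = Cov_data(v_i, h_j) - Cov_model(v_i, h_j), plus corrected bias gradients built from activities averaged across the two phases. A minimal NumPy sketch follows; how the model-phase statistics are sampled (e.g. CD or PCD) is outside the sketch, and the exact bias corrections should be read as a good-faith reconstruction of the paper's formulas.

```python
# A minimal NumPy sketch of the enhanced gradient for a binary RBM,
# computed from data-phase and model-phase batch statistics.
import numpy as np

def enhanced_gradient(v_d, h_d, v_m, h_m):
    """v_d, h_d: data-phase visible/hidden probabilities, (n, nv) / (n, nh).
    v_m, h_m: model-phase samples of the same shapes.
    Returns (dW, db, dc)."""
    n = v_d.shape[0]
    mv_d, mh_d = v_d.mean(0), h_d.mean(0)
    mv_m, mh_m = v_m.mean(0), h_m.mean(0)
    # Covariances under the data and model distributions.
    cov_d = v_d.T @ h_d / n - np.outer(mv_d, mh_d)
    cov_m = v_m.T @ h_m / n - np.outer(mv_m, mh_m)
    dW = cov_d - cov_m                   # invariant to bit-flip transformations
    v_hat = 0.5 * (mv_d + mv_m)          # activities averaged over both phases
    h_hat = 0.5 * (mh_d + mh_m)
    db = (mv_d - mv_m) - dW @ h_hat      # corrected visible-bias gradient
    dc = (mh_d - mh_m) - dW.T @ v_hat    # corrected hidden-bias gradient
    return dW, db, dc
```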


Subject(s)
Algorithms , Artificial Intelligence , Computer Simulation
4.
Adv Exp Med Biol; 718: 75-85, 2011.
Article in English | MEDLINE | ID: mdl-21744211

ABSTRACT

We study the emergent properties of an artificial neural network that combines segmentation by oscillations with biased competition for perceptual processing. The aim is to make progress in image segmentation by abstractly mimicking the way the cerebral cortex works. In our model, the neurons associated with features belonging to an object start to oscillate synchronously, while competing objects oscillate in opposing phase. The emergent properties of the network are confirmed by experiments with artificial image data.
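
[Editor's note] The synchronize-within, oppose-between behavior can be illustrated with a toy phase-oscillator model (this is not the authors' network): Kuramoto-style oscillators with positive coupling between features of the same object and negative coupling across objects. All coupling strengths and group sizes below are illustrative assumptions.

```python
# Toy sketch of segmentation by oscillation: features of the same object
# phase-lock, while the two competing groups settle into opposing phases.
import numpy as np

rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 1, 1, 1])          # two "objects", 3 features each
K = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
np.fill_diagonal(K, 0.0)

phase = rng.uniform(0, 2 * np.pi, size=labels.size)
dt = 0.05
for _ in range(2000):
    # d(phase_i)/dt = sum_j K_ij * sin(phase_j - phase_i)
    phase += dt * (K * np.sin(phase[None, :] - phase[:, None])).sum(axis=1)

# Within-group phases align; the two groups end up roughly pi apart.
print(np.round(np.mod(phase - phase[0], 2 * np.pi), 2))
```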


Subject(s)
Attention , Models, Theoretical , Nerve Net , Neural Networks, Computer , Learning