Results 1 - 20 of 23
1.
Article in English | MEDLINE | ID: mdl-37310825

ABSTRACT

The dual neural network (DNN)-based k-winner-take-all (kWTA) model is able to identify the k largest numbers from its m input numbers. When there are imperfections, such as a non-ideal step function and Gaussian input noise, in the realization, the model may not output the correct result. This brief analyzes the influence of these imperfections on the operational correctness of the model. Due to the imperfections, it is not efficient to use the original DNN-kWTA dynamics for the analysis. Hence, this brief first derives an equivalent model that describes the dynamics of the model under the imperfections. From the equivalent model, we derive a sufficient condition under which the model outputs the correct result. We then apply the sufficient condition to design an efficient method for estimating the probability that the model outputs the correct result. Furthermore, for inputs with a uniform distribution, a closed-form expression for this probability is derived. Finally, we extend our analysis to handle non-Gaussian input noise. Simulation results are provided to validate our theoretical results.
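For readers unfamiliar with the model, the sketch below simulates the single-state DNN-kWTA dynamics dy/dt = Σ_i g(u_i − y) − k with a logistic g standing in for the non-ideal step function, plus additive Gaussian input noise. The gain, noise level, and step size are illustrative assumptions, not values from the brief.

```python
import numpy as np

def dnn_kwta(u, k, gain=50.0, noise_std=0.0, dt=1e-3, steps=20000, rng=None):
    """Sketch of the DNN-kWTA dynamics with a logistic TLU and additive Gaussian
    input noise.  The ideal model uses a Heaviside step; here
    g(s) = 1 / (1 + exp(-gain * s)) models the non-ideal step function."""
    rng = np.random.default_rng() if rng is None else rng
    u_noisy = u + rng.normal(0.0, noise_std, size=u.shape)  # persistent input noise
    y = 0.0                                                  # single recurrent state
    for _ in range(steps):
        x = 1.0 / (1.0 + np.exp(-gain * (u_noisy - y)))      # TLU outputs
        y += dt * (x.sum() - k)                              # dy/dt = sum(x) - k
    return x

# Example: identify the k = 3 largest of 10 uniform inputs.
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, size=10)
x = dnn_kwta(u, k=3, gain=80.0, noise_std=0.01, rng=rng)
print("true top-3 :", np.sort(np.argsort(u)[-3:]))
print("model top-3:", np.sort(np.argsort(x)[-3:]))
```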

2.
IEEE Trans Neural Netw Learn Syst ; 34(8): 5218-5226, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34847045

ABSTRACT

The objective of compressive sampling is to determine a sparse vector from an observation vector. This brief describes an analog neural method for achieving this objective. Unlike previous analog neural models, which either resort to an l1-norm approximation or offer only local convergence, the proposed method avoids any approximation of the l1-norm term and is provably capable of reaching the optimum solution. Moreover, its computational complexity is lower than that of the three comparison analog models. Simulation results show that the error performance of the proposed model is comparable to that of several state-of-the-art digital algorithms and analog models, and that its convergence is faster than that of the comparison analog neural models.
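The abstract does not give the model equations, so the sketch below solves the same sparse-recovery problem with the standard (digital) ISTA algorithm as a point of reference; it is not the analog neural model proposed in the brief.

```python
import numpy as np

def ista(A, b, lam=0.05, iters=500):
    """Iterative shrinkage-thresholding (ISTA) for min 0.5||Ax-b||^2 + lam*||x||_1.
    A standard digital baseline for the compressive-sampling problem the paper
    addresses; NOT the analog neural model described in the abstract."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x - (A.T @ (A @ x - b)) / L    # gradient step on the quadratic term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(1)
n, m, s = 200, 80, 8                        # signal length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.normal(size=s)
A = rng.normal(size=(m, n)) / np.sqrt(m)
b = A @ x_true
x_hat = ista(A, b)
print("relative recovery error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```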

3.
IEEE Trans Neural Netw Learn Syst ; 34(5): 2619-2632, 2023 May.
Article in English | MEDLINE | ID: mdl-34487503

ABSTRACT

For decades, adding fault/noise during training by gradient descent has been a technique for making a neural network (NN) tolerant to persistent fault/noise or for improving its generalization. In recent years, this technique has been readvocated in deep learning to avoid overfitting. Yet, the objective function of such fault/noise injection learning has been misinterpreted as the desired measure (i.e., the expected mean squared error (MSE) of the training samples) of the NN with the same fault/noise. The aims of this article are: 1) to clarify the above misconception and 2) to investigate the actual regularization effect of adding node fault/noise when training by gradient descent. Based on previous work on adding fault/noise during training, we discuss why the misconception appears. In the sequel, it is shown that the learning objective of adding random node fault during gradient descent learning (GDL) for a multilayer perceptron (MLP) is identical to the desired measure of the MLP with the same fault. If additive (resp. multiplicative) node noise is added during GDL for an MLP, the learning objective is not identical to the desired measure of the MLP with such noise. For radial basis function (RBF) networks, it is shown that the learning objective is identical to the corresponding desired measure for all three fault/noise conditions. Empirical evidence is presented to support these theoretical results and hence clarify the misconception: the objective function of fault/noise injection learning cannot, in general, be interpreted as the desired measure of the NN with the same fault/noise. Afterward, the regularization effect of adding node fault/noise during training is revealed for the case of RBF networks. Notably, it is shown that the regularization effect of adding additive or multiplicative node noise (MNN) during RBF training is a reduction of network complexity. When dropout regularization is applied to RBF networks, its effect is the same as that of adding MNN during training.
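As a rough illustration of the training technique discussed, the sketch below injects multiplicative node noise into the hidden-layer outputs of an RBF network at every gradient step. The network, noise level, and other constants are assumptions made for the example.

```python
import numpy as np

def train_rbf_with_node_noise(X, y, centers, width=0.2, noise_std=0.3,
                              lr=0.05, epochs=200, rng=None):
    """Sketch of multiplicative node-noise injection for an RBF network: at every
    gradient step the hidden-node outputs are multiplied by (1 + e_j),
    e_j ~ N(0, noise_std^2), drawn afresh.  All constants are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    Phi = np.exp(-((X[:, None] - centers[None, :]) ** 2) / (width ** 2))  # (N, M)
    w = np.zeros(centers.shape[0])
    for _ in range(epochs):
        noise = 1.0 + rng.normal(0.0, noise_std, size=Phi.shape)  # multiplicative noise
        Phi_noisy = Phi * noise
        err = Phi_noisy @ w - y
        w -= lr * (Phi_noisy.T @ err) / len(y)                    # gradient step
    return w

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * X) + rng.normal(0.0, 0.05, size=X.shape)
centers = np.linspace(0.0, 1.0, 20)
w = train_rbf_with_node_noise(X, y, centers, rng=rng)
print("trained weight norm:", np.linalg.norm(w))  # noise injection keeps this modest
```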

4.
IEEE Trans Neural Netw Learn Syst ; 33(7): 3184-3192, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33513113

ABSTRACT

The dual neural network-based k-winner-take-all (DNN-kWTA) model is an analog neural model used to identify the k largest numbers from n inputs. Since threshold logic units (TLUs) are key elements in the model, offset voltage drifts in the TLUs may affect the operational correctness of a DNN-kWTA network. Previous studies assume that the drifts in the TLUs follow particular distributions. This brief considers that only the drift range, given by [-∆, ∆], is available. We consider two drift cases: time-invariant and time-varying. For the time-invariant case, we show that the state of a DNN-kWTA network converges, and a sufficient condition for the network to operate correctly is given. Furthermore, for uniformly distributed inputs, we prove that the probability that a DNN-kWTA network operates properly is greater than (1-2∆)^n. These results are then generalized to the time-varying case. In addition, for the time-invariant case, we derive a method to compute the exact convergence time for a given data set. For uniformly distributed inputs, we further derive the mean and variance of the convergence time. The convergence time results give us an idea of the operational speed of the DNN-kWTA model. Finally, simulation experiments have been conducted to validate these theoretical results.
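Under the simplifying assumption that an offset drift d_i ∈ [-∆, ∆] acts as a perturbation of the i-th input, the sketch below estimates by Monte Carlo the probability that the drifted network still selects the true k winners and compares it with the (1-2∆)^n bound quoted above.

```python
import numpy as np

def correct_prob_mc(n=10, k=3, delta=0.01, trials=200_000, rng=None):
    """Monte Carlo sketch: inputs u_i ~ U(0,1); each TLU has an offset drift
    d_i ~ U(-delta, delta) that effectively perturbs its input.  The network is
    taken to operate correctly when the k largest perturbed inputs coincide with
    the k largest true inputs (a simplifying assumption about how drift enters)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(0.0, 1.0, size=(trials, n))
    d = rng.uniform(-delta, delta, size=(trials, n))
    top_true = np.sort(np.argsort(u, axis=1)[:, -k:], axis=1)
    top_drift = np.sort(np.argsort(u + d, axis=1)[:, -k:], axis=1)
    return np.all(top_true == top_drift, axis=1).mean()

n, k, delta = 10, 3, 0.01
print("Monte Carlo estimate  :", correct_prob_mc(n, k, delta))
print("(1 - 2*delta)**n bound:", (1 - 2 * delta) ** n)
```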

5.
IEEE Trans Neural Netw Learn Syst ; 31(6): 2227-2232, 2020 Jun.
Article in English | MEDLINE | ID: mdl-31398136

ABSTRACT

Over the decades, gradient descent has been applied to develop learning algorithms for training a neural network (NN). In this brief, a limitation of applying such algorithms to train an NN with persistent weight noise is revealed. Let V(w) be the performance measure of an ideal NN; V(w) is used to develop the gradient descent learning (GDL) algorithm. With weight noise, the desired performance measure (denoted J(w)) is E[V(w̃)|w], where w̃ is the noisy weight vector. When GDL is applied to train an NN with weight noise, the actual learning objective is clearly not V(w) but another scalar function L(w). For decades, there has been a misconception that L(w) = J(w), and hence that the model attained by GDL is the desired model. However, we show that this is not always the case: 1) with persistent additive weight noise, the model attained is the desired model, since L(w) = J(w); and 2) with persistent multiplicative weight noise, the model attained is unlikely to be the desired model, since L(w) ≠ J(w). Accordingly, the properties of the attained models, as compared with the desired models, are analyzed and the learning curves are sketched. Simulation results on 1) a simple regression problem and 2) MNIST handwritten digit recognition are presented to support our claims.
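A minimal sketch of the setting, assuming a linear regression model: gradient descent is run with persistent multiplicative weight noise, and the attained weights are compared with the noise-free least-squares solution. All constants are illustrative.

```python
import numpy as np

def gdl_with_multiplicative_weight_noise(X, y, noise_std=0.3, lr=0.01,
                                         steps=5000, rng=None):
    """Sketch of gradient descent learning when the weights carry persistent
    multiplicative noise: each step uses w_tilde = w * (1 + e),
    e ~ N(0, noise_std^2) drawn afresh.  Linear regression and all constants
    are illustrative assumptions, not the paper's setup."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        e = rng.normal(0.0, noise_std, size=w.shape)
        w_tilde = w * (1.0 + e)                       # noisy weights used in the pass
        grad = X.T @ (X @ w_tilde - y) / len(y)
        w -= lr * grad * (1.0 + e)                    # chain rule through w_tilde
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + rng.normal(0.0, 0.1, size=500)
w_noisy = gdl_with_multiplicative_weight_noise(X, y, rng=rng)
w_clean = np.linalg.lstsq(X, y, rcond=None)[0]
print("noise-injected solution:", np.round(w_noisy, 3))
print("noise-free solution    :", np.round(w_clean, 3))
```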

6.
IEEE Trans Neural Netw Learn Syst ; 30(10): 3200-3204, 2019 Oct.
Article in English | MEDLINE | ID: mdl-30668482

ABSTRACT

This brief presents analytical results on the effect of additive weight/bias noise on a Boltzmann machine (BM) in which the unit outputs are in {-1, 1} instead of {0, 1}. With such noise, it is found that the state distribution is still a Boltzmann distribution, but with an elevated temperature factor. Accordingly, the desired gradient ascent learning algorithm is derived and the corresponding learning procedure is developed. This learning procedure is compared with the learning procedure applied to train a BM with noise, and the two procedures are found to be identical. Therefore, the learning algorithm for noise-free BMs is suitable for implementation as an online learning algorithm for an analog circuit-implemented BM, even if the variances of the additive weight noise and bias noise are unknown.
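The sketch below runs Gibbs sampling for a toy Boltzmann machine with ±1 units and exposes the temperature factor T, which is the quantity the abstract says is effectively elevated by additive weight/bias noise. The weights and biases are arbitrary illustrative values.

```python
import numpy as np

def gibbs_sample_bm(W, b, T=1.0, steps=20000, rng=None):
    """Gibbs sampling for a small Boltzmann machine with +-1 units,
    E(s) = -0.5 s'Ws - b's.  The temperature T is exposed because the abstract's
    result is that additive weight/bias noise behaves like an elevated T; this
    toy network is an illustration, not the paper's setup."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(b)
    s = rng.choice([-1.0, 1.0], size=n)
    samples = []
    for t in range(steps):
        i = t % n
        a = W[i] @ s - W[i, i] * s[i] + b[i]          # local field (no self-coupling)
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * a / T))   # P(s_i = +1 | rest)
        s[i] = 1.0 if rng.random() < p_plus else -1.0
        samples.append(s.copy())
    return np.array(samples)

rng = np.random.default_rng(0)
W = np.array([[0.0, 0.8, -0.4], [0.8, 0.0, 0.6], [-0.4, 0.6, 0.0]])
b = np.array([0.2, -0.1, 0.0])
for T in (1.0, 2.0):                                   # higher T flattens the distribution
    samples = gibbs_sample_bm(W, b, T=T, rng=rng)
    print(f"T={T}: mean unit states {samples[5000:].mean(axis=0).round(2)}")
```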

7.
IEEE Trans Neural Netw Learn Syst ; 29(9): 4212-4222, 2018 09.
Article in English | MEDLINE | ID: mdl-29989975

ABSTRACT

In this paper, the effects of input noise, output node stochasticity, and recurrent state noise on the Wang kWTA model are analyzed. Here, we assume that noise exists at the recurrent state y(t) and can be either additive or multiplicative. Besides, its dynamical change (i.e., dy/dt) is corrupted by noise as well. In the sequel, we model the dynamics of y(t) as a stochastic differential equation and show that the stochastic behavior of y(t) is equivalent to an Itô diffusion. Its stationary distribution is a Gibbs distribution, whose modality depends on the noise condition. With moderate input noise and very small recurrent state noise, the distribution is unimodal, and hence y(∞) has a high probability of lying between the input values of the k-th and (k+1)-th largest inputs (i.e., correct output). With small input noise and large recurrent state noise, the distribution can be multimodal, and hence y(∞) may, with non-negligible probability, lie outside the input values of the k-th and (k+1)-th largest inputs (i.e., incorrect output). In this regard, we further derive the conditions under which the kWTA has a high probability of giving the correct output. Our results reveal that recurrent state noise can have a severe effect on the Wang kWTA model, but input noise and output node stochasticity can alleviate this effect.
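The sketch below is an Euler–Maruyama simulation of a noisy single-state kWTA recurrent state, assuming dynamics of the form dy = (Σ_i h(u_i + n_i − y) − k) dt + σ dB with Heaviside h; the exact form of Wang's model and of the noise entry points in the paper may differ.

```python
import numpy as np

def noisy_kwta_em(u, k, state_noise=0.05, input_noise=0.02,
                  dt=1e-3, steps=50000, rng=None):
    """Euler-Maruyama sketch of a kWTA recurrent state corrupted by noise,
    assuming dy = (sum_i h(u_i + n_i - y) - k) dt + sigma dB with Heaviside h.
    This only illustrates the Ito-diffusion view; the paper's model may differ."""
    rng = np.random.default_rng() if rng is None else rng
    y = 0.0
    trace = np.empty(steps)
    for t in range(steps):
        n_in = rng.normal(0.0, input_noise, size=u.shape)      # input noise
        drift = (u + n_in - y > 0).sum() - k                   # Heaviside outputs are 0/1
        y += drift * dt + state_noise * np.sqrt(dt) * rng.normal()
        trace[t] = y
    return trace

rng = np.random.default_rng(0)
u = np.sort(rng.uniform(0.0, 1.0, size=8))
trace = noisy_kwta_em(u, k=3, rng=rng)
# Correct operation requires y(inf) to stay between the (k+1)-th and k-th largest inputs.
print("target interval:", (u[-4], u[-3]))
print("stationary y (last 10k steps): mean %.3f, std %.3f"
      % (trace[-10000:].mean(), trace[-10000:].std()))
```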

8.
IEEE Trans Neural Netw Learn Syst ; 29(4): 1082-1094, 2018 04.
Article in English | MEDLINE | ID: mdl-28186910

ABSTRACT

This paper studies the effects of uniform input noise and Gaussian input noise on the dual neural network-based WTA (DNN-WTA) model. We show that the state of the network (under either uniform or Gaussian input noise) converges to one of the equilibrium points. We then derive a formula to check whether the network produces the correct outputs. Furthermore, for uniformly distributed inputs, two lower bounds (one for each type of input noise) on the probability that the network produces the correct outputs are presented. Besides, when the minimum separation among the inputs is given, we derive the condition for the network to produce the correct outputs. Finally, experimental results are presented to verify our theoretical results. Since random drift in the comparators can be considered input noise, our results can be applied to the random drift situation.

9.
IEEE Trans Neural Netw Learn Syst ; 27(4): 863-74, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26990391

ABSTRACT

Fault tolerance is an interesting property of artificial neural networks. However, the existing fault models are able to describe only limited node fault situations, such as stuck-at-zero and stuck-at-one. There is no general model that can describe a large class of node fault situations. This paper studies the performance of faulty radial basis function (RBF) networks for the general node fault situation. We first propose a general node fault model that is able to describe a large class of node fault situations, such as stuck-at-zero, stuck-at-one, and stuck-at levels with arbitrary distributions. Afterward, we derive an expression to describe the performance of faulty RBF networks. An objective function is then identified from this expression. With the objective function, a training algorithm for the general node fault situation is developed. Finally, a mean prediction error (MPE) formula that is able to estimate the test set error of faulty networks is derived. The application of the MPE formula to the selection of the basis width is elucidated. Simulation experiments are then performed to demonstrate the effectiveness of the proposed method.
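A Monte Carlo sketch of the general node fault model described above: each hidden node of a trained RBF network is, with some probability, stuck at a level drawn from an arbitrary distribution, and the resulting MSE is averaged over fault patterns. The fault probability, stuck-level distribution, and toy data are assumptions.

```python
import numpy as np

def faulty_rbf_mse(w, Phi, y, fault_prob=0.1, stuck_sampler=None,
                   trials=2000, rng=None):
    """Monte Carlo sketch of the general node-fault model: each hidden node is,
    with probability fault_prob, stuck at a level drawn from an arbitrary
    distribution (stuck-at-zero and stuck-at-one are special cases).  Averaging
    the MSE over sampled fault patterns approximates the faulty-network
    performance; details here are assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    if stuck_sampler is None:
        stuck_sampler = lambda size: rng.uniform(0.0, 1.0, size=size)  # arbitrary levels
    M = Phi.shape[1]
    total = 0.0
    for _ in range(trials):
        faulty = rng.random(M) < fault_prob
        Phi_f = Phi.copy()
        Phi_f[:, faulty] = stuck_sampler(faulty.sum())    # node output stuck at a level
        total += np.mean((Phi_f @ w - y) ** 2)
    return total / trials

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 100)
centers = np.linspace(0.0, 1.0, 15)
Phi = np.exp(-((X[:, None] - centers[None, :]) ** 2) / 0.05)
y = np.sin(2 * np.pi * X)
w = np.linalg.lstsq(Phi, y, rcond=None)[0]
print("fault-free MSE :", np.mean((Phi @ w - y) ** 2))
print("faulty MSE (MC):", faulty_rbf_mse(w, Phi, y, rng=rng))
```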

10.
Case Rep Genet ; 2015: 131852, 2015.
Article in English | MEDLINE | ID: mdl-26064708

ABSTRACT

This case report concerns a 16-year-old girl with a 9.92 Mb, heterozygous interstitial chromosome deletion at 7q33-q35, identified using array comparative genomic hybridization. The patient has dysmorphic facial features, intellectual disability, recurrent infections, self-injurious behavior, obesity, and recent onset of hemihypertrophy. This patient has overlapping features with previously reported individuals who have similar deletions spanning the 7q32-q36 region. It has been difficult to describe an interstitial 7q deletion syndrome due to variations in the sizes and regions in the few patients reported in the literature. This case contributes to the further characterization of an interstitial distal 7q deletion syndrome.

11.
Case Rep Genet ; 2015: 453105, 2015.
Article in English | MEDLINE | ID: mdl-25789183

ABSTRACT

Pelizaeus-Merzbacher disease (PMD) is a neurodegenerative leukodystrophy caused by dysfunction of the proteolipid protein 1 (PLP1) gene on Xq22, which codes for an essential myelin protein. As an X-linked condition, PMD primarily affects males; however, a small number of affected females with a variety of different mutations in this gene have been reported in the medical literature. No affected female reported to date has a deletion like that of our patient. In addition, our patient has skewed X-chromosome inactivation, which contributes to her presentation, as her unaffected mother also carries the mutation.

12.
IEEE Trans Neural Netw Learn Syst ; 26(9): 2188-93, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25376043

ABSTRACT

The dual neural network (DNN)-based k-winner-take-all (kWTA) model is an effective approach for finding the k largest inputs from n inputs. Its major assumption is that the threshold logic units (TLUs) can be implemented in a perfect way. However, when differential bipolar pairs are used for implementing TLUs, the transfer function of the TLUs is a logistic function. This brief studies the properties of the DNN-kWTA model under this imperfect situation. We prove that, given any initial state, the network settles down at the unique equilibrium point. Besides, the energy function of the model is revealed. Based on the energy function, we propose an efficient method for studying the model performance when the inputs have continuous distribution functions. Furthermore, for uniformly distributed inputs, we derive a formula to estimate the probability that the model produces the correct outputs. Finally, for the case that the minimum separation ∆min of the inputs is given, we prove that if the gain of the activation function is greater than 1/(4∆min) · max(ln 2n, 2 ln((1−ϵ)/ϵ)), then the network can produce the correct outputs, with winner outputs greater than 1−ϵ and loser outputs less than ϵ, where ϵ is a threshold less than 0.5.
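The sketch below evaluates the gain condition quoted above (as reconstructed here, which should be checked against the original brief) and simulates the logistic-TLU DNN-kWTA dynamics at a gain slightly above that threshold.

```python
import numpy as np

def required_gain(delta_min, n, eps):
    """Gain threshold as reconstructed from the abstract:
    alpha > 1/(4*delta_min) * max(ln(2n), 2*ln((1-eps)/eps)).
    The reconstruction is an assumption; verify against the original brief."""
    return (1.0 / (4.0 * delta_min)) * max(np.log(2 * n), 2 * np.log((1 - eps) / eps))

def logistic_dnn_kwta(u, k, gain, dt=1e-4, steps=200000):
    """Simulate the DNN-kWTA dynamics with a logistic TLU of the given gain."""
    y = 0.0
    for _ in range(steps):
        x = 1.0 / (1.0 + np.exp(-gain * (u - y)))
        y += dt * (x.sum() - k)
    return x

rng = np.random.default_rng(0)
u = np.sort(rng.uniform(0.0, 1.0, size=8))
delta_min = np.diff(u).min()
eps = 0.05
gain = 1.1 * required_gain(delta_min, len(u), eps)     # slightly above the threshold
x = logistic_dnn_kwta(u, k=3, gain=gain)
print("winner outputs:", np.round(x[-3:], 3), " > 1 - eps ?", bool(np.all(x[-3:] > 1 - eps)))
print("loser outputs :", np.round(x[:-3], 3), " < eps ?", bool(np.all(x[:-3] < eps)))
```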


Subject(s)
Algorithms , Decision Support Techniques , Neural Networks, Computer , Computer Simulation , Feedback , Humans , Probability , Time Factors
13.
Am J Med Genet A ; 161A(6): 1405-8, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23613140

ABSTRACT

Several recent reports of interstitial deletions at the terminal end of the short arm of chromosome 3 have helped to define the critical region whose deletion causes 3p deletion syndrome. We report on an 11-year-old girl with intellectual disability, obsessive-compulsive tendencies, hypotonia, and dysmorphic facial features in whom a 684 kb interstitial 3p25.3 deletion was characterized using array-CGH. This deletion overlaps with the interstitial 3p25 deletions reported in three recent case reports. These deletions share a 124 kb region of overlap that includes only three RefSeq-annotated genes: THUMPD3, SETD5, and LOC440944. The current patient shared phenotypic features, including intellectual disability, hypotonia, a depressed nasal bridge, and a long philtrum, with previously reported patients, while she did not have the cardiac defects, seizures, or microcephaly reported in patients with larger deletions. This patient therefore furthers our knowledge of the consequences of 3p deletions, while suggesting genotype-phenotype correlations.


Subject(s)
Chromosomes, Human, Pair 3/genetics , Intellectual Disability/genetics , Child , Chromosome Deletion , Comparative Genomic Hybridization , Female , Genetic Association Studies , Genotype , Humans , In Situ Hybridization, Fluorescence , Muscle Hypotonia , Obsessive-Compulsive Disorder , Phenotype
14.
IEEE Trans Neural Netw Learn Syst ; 24(9): 1472-8, 2013 Sep.
Article in English | MEDLINE | ID: mdl-24808584

ABSTRACT

Recently, an analog neural network model, namely Wang's kWTA, was proposed. In this model, the output nodes are defined by the Heaviside function. Its finite-time convergence property and exact convergence time have been analyzed. However, these characteristics of the model are based on the assumption that there are no physical defects during operation. In this brief, we analyze the convergence behavior of Wang's kWTA model when defects exist during operation. Two defect conditions are considered: the first is input noise; the second is stochastic behavior in the output nodes. The convergence of Wang's kWTA under these two defects is analyzed and the corresponding energy function is revealed.


Subject(s)
Neural Networks, Computer , Stochastic Processes , Animals , Computer Simulation , Humans
15.
IEEE Trans Neural Netw Learn Syst ; 23(4): 676-82, 2012 Apr.
Article in English | MEDLINE | ID: mdl-24805051

ABSTRACT

A k-winner-take-all (kWTA) network is able to find the k largest numbers from n inputs. Recently, a dual neural network (DNN) approach was proposed to implement the kWTA process. Compared to the conventional approach, the DNN approach requires far fewer interconnections. Previously, only a rough upper bound on the convergence time of the DNN-kWTA model, expressed in terms of the input variables, was available. This brief derives the exact convergence time of the DNN-kWTA model. With our result, we can study the convergence time without spending excessive time simulating the network dynamics. We also theoretically study the statistical properties of the convergence time when the inputs are uniformly distributed. Since a nonuniform distribution can be converted into a uniform one and the conversion preserves the ordering of the inputs, our theoretical result is also valid for nonuniformly distributed inputs.
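For comparison with the closed-form result, the sketch below measures the convergence time of the ideal (Heaviside) DNN-kWTA dynamics by brute-force simulation, i.e., the first time the state enters the interval between the (k+1)-th and k-th largest inputs. The initial state and step size are assumptions.

```python
import numpy as np

def dnn_kwta_convergence_time(u, k, y0=0.0, dt=1e-4, t_max=10.0):
    """Sketch: measure the convergence time of the ideal (Heaviside) DNN-kWTA
    dynamics dy/dt = sum_i h(u_i - y) - k by simulation, i.e., the first time y
    enters the interval between the (k+1)-th and k-th largest inputs, where the
    drift becomes zero.  A closed-form result would avoid this simulation."""
    u_sorted = np.sort(u)
    lo, hi = u_sorted[-(k + 1)], u_sorted[-k]     # equilibrium interval
    y, t = y0, 0.0
    while t < t_max:
        if lo < y < hi:
            return t
        y += dt * ((u > y).sum() - k)             # Heaviside outputs
        t += dt
    return None                                    # did not converge within t_max

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, size=12)
print("simulated convergence time:", dnn_kwta_convergence_time(u, k=4))
```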


Subject(s)
Algorithms , Decision Support Techniques , Models, Statistical , Neural Networks, Computer , Computer Simulation
16.
IEEE Trans Neural Netw Learn Syst ; 23(7): 1148-55, 2012 Jul.
Article in English | MEDLINE | ID: mdl-24807140

ABSTRACT

Fault tolerance is an interesting topic in neural networks. However, many existing results on this topic focus only on the situation of a single fault source. In fact, a trained network may be affected by multiple fault sources. This brief studies the performance of faulty radial basis function (RBF) networks that suffer from multiplicative weight noise and open weight fault concurrently. We derive a mean prediction error (MPE) formula to estimate the generalization ability of faulty networks. The MPE formula provides a way to understand the generalization ability of faulty networks without using a test set or generating a number of potential faulty networks. Based on the MPE result, we propose methods to optimize the regularization parameter as well as the RBF width.

17.
IEEE Trans Neural Netw Learn Syst ; 23(11): 1827-40, 2012 Nov.
Article in English | MEDLINE | ID: mdl-24808076

ABSTRACT

Injecting weight noise during training is a simple technique that has been proposed for almost two decades. However, little is known about its convergence behavior. This paper studies the convergence of two weight noise injection-based training algorithms: multiplicative weight noise injection with weight decay and additive weight noise injection with weight decay. We consider their application to multilayer perceptrons with either linear or sigmoid output nodes. Let w(t) be the weight vector, let V(w) be the corresponding objective function of the training algorithm, let α > 0 be the weight decay constant, and let µ(t) be the step size. We show that if µ(t) → 0, then with probability one E[||w(t)||₂²] is bounded and lim_{t→∞} ||w(t)||₂ exists. Based on these two properties, we show that if µ(t) → 0, Σ_t µ(t) = ∞, and Σ_t µ(t)² < ∞, then with probability one these algorithms converge. Moreover, w(t) converges with probability one to a point where ∇_w V(w) = 0.
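A minimal sketch of the algorithms analyzed, assuming a linear model: weight noise is injected at every online step, a weight decay term is added, and the step size µ(t) = µ0/(t+1) satisfies the three conditions stated above. Constants are illustrative.

```python
import numpy as np

def noise_injection_with_decay(X, y, alpha=0.01, noise_std=0.1, mu0=0.5,
                               steps=20000, multiplicative=True, rng=None):
    """Sketch of weight-noise-injection training with weight decay and a step
    size mu(t) = mu0/(t+1), which satisfies mu(t) -> 0, sum mu(t) = inf, and
    sum mu(t)^2 < inf.  The linear model and all constants are illustrative
    assumptions, not the paper's experimental setup."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.zeros(X.shape[1])
    for t in range(steps):
        mu = mu0 / (t + 1.0)
        i = rng.integers(len(y))                           # online (single-sample) step
        e = rng.normal(0.0, noise_std, size=w.shape)
        w_tilde = w * (1.0 + e) if multiplicative else w + e
        grad = (X[i] @ w_tilde - y[i]) * X[i]              # gradient at the noisy weights
        w -= mu * (grad + alpha * w)                       # noise injection + weight decay
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + rng.normal(0.0, 0.1, size=1000)
print("converged weights:", np.round(noise_injection_with_decay(X, y, rng=rng), 3))
```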

18.
IEEE Trans Neural Netw Learn Syst ; 23(2): 211-22, 2012 Feb.
Article in English | MEDLINE | ID: mdl-24808501

ABSTRACT

Improving the fault tolerance of a neural network has been studied for more than two decades, and various training algorithms have been proposed to this end. The online node fault injection-based algorithm, in which hidden nodes randomly output zeros during training, is one of them. While the idea is simple, theoretical analyses of this algorithm are far from complete. This paper presents its objective function and a convergence proof. We consider three cases for multilayer perceptrons (MLPs): (1) MLPs with a single linear output node; (2) MLPs with multiple linear output nodes; and (3) MLPs with a single sigmoid output node. For the convergence proof, we show that the algorithm converges with probability one. For the objective function, we show that the corresponding objective functions of cases (1) and (2) have the same form: both consist of a mean squared error term, a regularizer term, and a weight decay term. For case (3), the objective function is slightly different from that of cases (1) and (2). With the objective functions derived, we can compare the similarities and differences among the various algorithms and cases.
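The sketch below implements the node-fault-injection idea for case (1), an MLP with a single linear output node: at every update each hidden node outputs zero with some probability. The architecture, fault probability, and other constants are assumptions, and batch rather than strictly online updates are used for brevity.

```python
import numpy as np

def train_mlp_node_fault(X, y, hidden=16, fault_prob=0.2, lr=0.05,
                         epochs=2000, rng=None):
    """Sketch of node-fault-injection training for an MLP with one linear output
    node: at every update each hidden node independently outputs zero with
    probability fault_prob.  Architecture and constants are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, size=hidden)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                       # hidden activations
        mask = (rng.random(hidden) > fault_prob)       # faulty nodes output zero
        h_f = h * mask
        err = h_f @ w2 - y                             # linear output node
        grad_w2 = h_f.T @ err / len(y)
        grad_h = np.outer(err, w2 * mask) * (1 - h ** 2)   # backprop through tanh
        W1 -= lr * X.T @ grad_h / len(y)
        b1 -= lr * grad_h.mean(axis=0)
        w2 -= lr * grad_w2
    return W1, b1, w2

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0])
W1, b1, w2 = train_mlp_node_fault(X, y, rng=rng)
pred = np.tanh(X @ W1 + b1) @ w2                       # fault-free prediction
print("training MSE of the fault-free network:", np.mean((pred - y) ** 2))
```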


Subject(s)
Algorithms , Models, Statistical , Neural Networks, Computer , Pattern Recognition, Automated/methods , Computer Simulation , Feedback , Online Systems , Signal-To-Noise Ratio
19.
IEEE Trans Neural Netw ; 22(2): 317-23, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21189237

ABSTRACT

Injecting weight noise during training has been a simple strategy for improving the fault tolerance of multilayer perceptrons (MLPs) for almost two decades, and several online training algorithms have been proposed in this regard. However, there are some misconceptions about the objective functions being minimized by these algorithms. Some existing results wrongly take the objective function of a weight noise injection algorithm to be the prediction error of a trained MLP affected by the same weight noise. In this brief, we clarify these misconceptions. Two weight noise injection scenarios are considered: one based on additive weight noise injection and the other based on multiplicative weight noise injection. To avoid the misconceptions, we use their mean updating equations to analyze the objective functions. For injecting additive weight noise during training, we show that the true objective function is identical to the prediction error of a faulty MLP whose weights are affected by additive weight noise; it consists of the conventional mean squared error and a smoothing regularizer. For injecting multiplicative weight noise during training, we show that the objective function is different from the prediction error of a faulty MLP whose weights are affected by multiplicative weight noise. With our results, some existing misconceptions regarding MLP training with weight noise injection can now be resolved.


Subject(s)
Algorithms , Artificial Intelligence , Neural Networks, Computer , Artifacts , Nonlinear Dynamics , Pattern Recognition, Automated/methods , Software Design , Teaching/methods
20.
IEEE Trans Neural Netw ; 21(8): 1232-44, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20682468

ABSTRACT

The weight-decay technique is an effective approach to handling overfitting and weight faults. For fault-free networks, without an appropriate value of the decay parameter, the trained network is either overfitted or underfitted. However, many existing results on the selection of the decay parameter focus on fault-free networks only. It is well known that the weight-decay method can also suppress the effect of weight faults. For the faulty case, using a test set to select the decay parameter is not practical because there is a huge number of possible faulty networks for a trained network. This paper develops two mean prediction error (MPE) formulae for predicting the performance of faulty radial basis function (RBF) networks. Two fault models, multiplicative weight noise and open weight fault, are considered. Our MPE formulae involve the training error and the trained weights only. Hence, in our method, we do not need to generate a huge number of faulty networks to measure the test error for the fault situation. The MPE formulae allow us to select appropriate values of the decay parameter for faulty networks. Our experiments showed that, although there are small differences between the true test errors (from the test set) and the MPE values, the MPE formulae can accurately locate the appropriate value of the decay parameter for minimizing the true test error of faulty networks.
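To make the contrast concrete, the sketch below does what the MPE formulae are designed to avoid: for each candidate decay parameter, it estimates the faulty-network error by averaging over many sampled faulty networks (multiplicative weight noise plus open weight fault). The fault levels and toy data are assumptions.

```python
import numpy as np

def faulty_error_mc(w, Phi, y, noise_std=0.2, open_prob=0.1, trials=2000, rng=None):
    """Brute-force sketch of what the MPE formulae avoid: estimate the error of a
    trained RBF network under multiplicative weight noise and open weight fault
    by averaging over many sampled faulty networks.  Fault levels and the toy
    data are assumptions for illustration."""
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0
    for _ in range(trials):
        w_f = w * (1.0 + rng.normal(0.0, noise_std, size=w.shape))  # weight noise
        w_f[rng.random(w.shape[0]) < open_prob] = 0.0               # open weight fault
        total += np.mean((Phi @ w_f - y) ** 2)
    return total / trials

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 200)
centers = np.linspace(0.0, 1.0, 20)
Phi = np.exp(-((X[:, None] - centers[None, :]) ** 2) / 0.02)
y = np.sin(2 * np.pi * X) + rng.normal(0.0, 0.05, size=X.shape)
# Weight-decay training for several decay parameters; pick the one giving the
# lowest estimated faulty-network error (the same toy data is reused for brevity).
for lam in (1e-4, 1e-2, 1e0):
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(centers)), Phi.T @ y)
    print(f"lambda={lam:g}: faulty MSE ~ {faulty_error_mc(w, Phi, y, rng=rng):.4f}")
```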


Subject(s)
Algorithms , Artifacts , Computer Simulation/standards , Neural Networks, Computer , Animals , Artificial Intelligence , Forecasting/methods , Humans , Linear Models , Nonlinear Dynamics , Predictive Value of Tests