ABSTRACT
We present average performance results for dynamical inference problems in large networks, where a set of nodes is hidden while the time trajectories of the others are observed. Examples of this scenario can occur in signal transduction and gene regulation networks. We focus on the linear stochastic dynamics of continuous variables interacting via random Gaussian couplings of generic symmetry. We analyze the inference error, given by the variance of the posterior distribution over hidden paths, in the thermodynamic limit and as a function of the system parameters and the ratio α between the number of hidden and observed nodes. By applying Kalman filter recursions we find that the posterior dynamics is governed by an "effective" drift that incorporates the effect of the observations. We present two approaches for characterizing the posterior variance that allow us to tackle, respectively, equilibrium and nonequilibrium dynamics. The first appeals to Random Matrix Theory and reveals average spectral properties of the inference error and typical posterior relaxation times; the second is based on dynamical functionals and yields the inference error as the solution of an algebraic equation.
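As a rough illustration of the Kalman filter recursions mentioned above, the following minimal sketch treats a toy discrete-time system with one hidden and one observed node. The discretisation, the scalar setting, and all names (`filter_hidden_node`, `a_hh`, `a_ho`, `a_oh`, `a_oo`, `q`, `r`) are assumptions made for illustration and are not taken from the paper, which studies continuous-time dynamics of large networks. The sketch shows that the variance recursion does not depend on the observed values and settles at the solution of an algebraic fixed-point equation.

```python
def filter_hidden_node(observed, a_hh, a_ho, a_oh, a_oo, q, r, m0=0.0, p0=1.0):
    """Scalar Kalman filter for a hidden node h coupled to an observed node o.

    Assumed toy model (hypothetical, for illustration only):
        h[t+1] = a_hh * h[t] + a_ho * o[t] + noise of variance q
        o[t+1] = a_oo * o[t] + a_oh * h[t] + noise of variance r  (r > 0)
    Returns the predicted means and variances of h at each time step.
    """
    m, p = m0, p0
    means, variances = [m], [p]
    for o_now, o_next in zip(observed, observed[1:]):
        # The increment of the observed node acts as a noisy measurement of h[t].
        innovation = o_next - a_oo * o_now - a_oh * m
        s = a_oh * a_oh * p + r          # innovation variance
        gain = p * a_oh / s              # Kalman gain
        m_upd = m + gain * innovation    # posterior mean of h[t]
        p_upd = p - gain * a_oh * p      # posterior variance of h[t]
        # Propagate the posterior of h[t] one step forward to h[t+1].
        m = a_hh * m_upd + a_ho * o_now
        p = a_hh * a_hh * p_upd + q
        means.append(m)
        variances.append(p)
    return means, variances


if __name__ == "__main__":
    trajectory = [0.0] * 200  # placeholder observations; the variance ignores their values
    _, variances = filter_hidden_node(trajectory, 0.5, 0.0, 1.0, 0.5, 1.0, 1.0)
    print(variances[-1])
```

In this toy setting the mean update contains the term `a_hh * gain * innovation`, which plays the role of a correction to the drift of the hidden node driven by the observations.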
ABSTRACT
The seeded region growing (SRG) algorithm is a fast, robust, parameter-free method for segmenting intensity images given initial seed locations for each region. The requirement of predetermined seeds means that the algorithm cannot operate fully autonomously. In this paper, we demonstrate a novel region-growing variant of the pulse-coupled neural network (PCNN), which offers comparable performance to the SRG and is able to generate seed locations internally, opening the way to fully autonomous operation.
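For readers unfamiliar with seeded region growing, the following is a simplified sketch of the general idea, not the implementation evaluated in the paper and not the PCNN variant. It assumes a 2D image given as a list of rows, 4-connectivity, nonzero integer region labels, and a priority equal to the absolute difference between a pixel's intensity and the region mean at the time the pixel is queued. The function name `seeded_region_growing` and its argument layout are hypothetical.

```python
import heapq


def seeded_region_growing(image, seeds):
    """Grow labelled regions from seeds over a 2D intensity image.

    image: list of rows of numbers.
    seeds: dict mapping a nonzero integer label to a list of (row, col) seeds.
    Returns a label map with the same shape as the image (0 = unlabelled).
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    total = {}    # running intensity sum per region
    count = {}    # pixel count per region
    heap = []     # entries: (priority, tie_breaker, row, col, label)
    counter = 0   # tie breaker keeps heap ordering deterministic

    def push_neighbours(r, c, label):
        nonlocal counter
        mean = total[label] / count[label]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                priority = abs(image[nr][nc] - mean)
                heapq.heappush(heap, (priority, counter, nr, nc, label))
                counter += 1

    # Label the seed pixels and initialise the region statistics.
    for label, points in seeds.items():
        total[label], count[label] = 0.0, 0
        for r, c in points:
            labels[r][c] = label
            total[label] += image[r][c]
            count[label] += 1
    for label, points in seeds.items():
        for r, c in points:
            push_neighbours(r, c, label)

    # Repeatedly assign the queued pixel that best matches a neighbouring region.
    while heap:
        _, _, r, c, label = heapq.heappop(heap)
        if labels[r][c] != 0:
            continue  # already claimed by some region
        labels[r][c] = label
        total[label] += image[r][c]
        count[label] += 1
        push_neighbours(r, c, label)
    return labels


if __name__ == "__main__":
    img = [[10, 10, 200, 200],
           [10, 12, 198, 200]]
    print(seeded_region_growing(img, {1: [(0, 0)], 2: [(0, 3)]}))
```

Note that the sketch needs no tuning parameter beyond the seed locations themselves, which is the dependency the abstract identifies as the obstacle to autonomous operation.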
ABSTRACT
We develop a generalization of the Thouless-Anderson-Palmer (TAP) mean-field approach of disorder physics, which makes the method applicable to the computation of approximate averages in probabilistic models for real data. In contrast to the conventional TAP approach, where the knowledge of the distribution of couplings between the random variables is required, our method adapts to the concrete set of couplings. We show the significance of the approach in two ways: Our approach reproduces replica symmetric results for a wide class of toy models (assuming a nonglassy phase) with given disorder distributions in the thermodynamic limit. On the other hand, simulations on a real data model demonstrate that the method achieves more accurate predictions as compared to conventional TAP approaches.
ABSTRACT
Using methods of statistical physics, we investigate the role of model complexity in learning with support vector machines (SVMs), which are an important alternative to neural networks. We show the advantages of using SVMs with kernels of infinite complexity on noisy target rules, which, in contrast to common theoretical beliefs, are found to achieve optimal generalization error although the training error does not converge to the generalization error. Moreover, we find universal asymptotics of the learning curves, which depend only on the target rule and not on the SVM kernel.
ABSTRACT
We develop an advanced mean field method for approximating averages in probabilistic data models that is based on the Thouless-Anderson-Palmer (TAP) approach of disorder physics. In contrast to conventional TAP, where the knowledge of the distribution of couplings between the random variables is required, our method adapts to the concrete couplings. We demonstrate the validity of our approach, which is so far restricted to models with nonglassy behavior, by replica calculations for a wide class of models as well as by simulations for a real data set.
ABSTRACT
We study learning of probability distributions characterized by an unknown symmetry direction. Based on an entropic performance measure and the variational method of statistical mechanics we develop exact upper and lower bounds on the scaled critical number of examples below which learning of the direction is impossible. The asymptotic tightness of the bounds suggests an asymptotically optimal method for learning nonsmooth distributions.
Subject(s)
Learning , Neural Networks, Computer
ABSTRACT
We derive a mean-field algorithm for binary classification with Gaussian processes that is based on the TAP approach originally proposed in statistical physics of disordered systems. The theory also yields an approximate leave-one-out estimator for the generalization error, which is computed with no extra computational cost. We show that from the TAP approach, it is possible to derive both a simpler "naive" mean-field theory and support vector machines (SVMs) as limiting cases. For both mean-field algorithms and support vector machines, simulation results for three small benchmark data sets are presented. They show that one may get state-of-the-art performance by using the leave-one-out estimator for model selection and that the built-in leave-one-out estimators are extremely precise when compared to the exact leave-one-out estimate. The second result is taken as strong support for the internal consistency of the mean-field approach.
Subject(s)
Algorithms , Classification , Computer Simulation , Models, Neurological , Normal Distribution , Animals , Artificial Intelligence , Bayes Theorem , Brachyura/anatomy & histology , Diabetes Mellitus, Type 2/ethnology , Diabetes Mellitus, Type 2/genetics , Discrimination Learning/physiology , Female , Genetic Predisposition to Disease , Humans , Indians, North American/genetics , Likelihood Functions , Male , Sex Characteristics , Sound Spectrography
ABSTRACT
The cDNA sequence encoding human beta-glucuronidase [Oshima, Kyle, Miller, Hoffmann, Powell, Grubb, Sly, Troplak, Guise and Gravel (1987) Proc. Natl. Acad. Sci. U.S.A. 84, 685-689] was expressed in baby hamster kidney (BHK) cells. After purification from the culture supernatant in one step by use of immunoaffinity chromatography, the biochemical properties of the enzyme were examined. With a pH optimum of 4.0, a Km of 1.3 mM and thermal stability up to 68 °C, this protein has characteristics very similar to those described for beta-glucuronidase from human placenta [Brot, Bell and Sly (1978) Biochemistry 17, 385-391]. However, the recombinant product has several structural properties not previously reported for beta-glucuronidase isolated from natural sources. First, recombinant beta-glucuronidase is synthesized as a tetramer consisting of two disulphide-linked dimers. As can be inferred from the cDNA sequence, the enzyme possesses five cysteine residues after cleavage of the signal peptide. By introducing a C-terminal truncation, we eliminated the last cysteine at position 644. In the mutant, covalent linkage between two monomers is no longer observed, indicating that Cys-644 is involved in intermolecular disulphide-bond formation. The functional role of the disulphide bond remains elusive, as it was shown that (i) intracellular transport of the mutant is not impaired and (ii) it is still able to form an enzymically active tetramer. A second feature that has not previously been observed for beta-glucuronidase from any origin is the existence of two enzymically active species for recombinant beta-glucuronidase, when examined by gel filtration on a TSK 3000 column. Given their apparent molecular masses of 380 kDa and 190 kDa, we propose that they represent tetramers and dimers respectively.
Partial N-terminal sequencing and electrophoresis under denaturing conditions revealed that the dimers consist of subunits that have been proteolytically processed at their C-terminus, losing 3-4 kDa in peptide mass. Controlled proteolysis demonstrates that the enzyme's overall protein backbone as well as its activity are resistant to a number of proteases. Only the C-terminal portion is susceptible to protease action, and the disulphide-linked form is readily converted into non-disulphide-bonded subunits. Pulse-chase analysis shows that human beta-glucuronidase remaining intracellular in BHK cells after synthesis undergoes a similar proteolytic processing event, i.e. a reduction in mass of 3-4 kDa. (ABSTRACT TRUNCATED AT 400 WORDS)