Results 1 - 20 of 25
1.
Neural Netw ; 166: 446-458, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37566955

ABSTRACT

Neural architecture search (NAS) is a framework for automating the design of neural network structures. While recent one-shot approaches have reduced the search cost, there still exists an inherent trade-off between cost and performance; it is therefore important to stop the search at an appropriate point and further reduce the high cost of NAS. Meanwhile, the differentiable architecture search (DARTS), a typical one-shot approach, is known to suffer from overfitting, and heuristic early-stopping strategies have been proposed to overcome such performance degradation. In this paper, we propose a more versatile and principled early-stopping criterion based on evaluating the gap between the expected generalisation errors of the previous and current search steps with respect to the architecture parameters. The stopping threshold is determined automatically at each search epoch at no additional cost. In numerical experiments, we demonstrate the effectiveness of the proposed method: we stop the one-shot NAS algorithms and evaluate the acquired architectures on the benchmark datasets NAS-Bench-201 and NATS-Bench. Our algorithm is shown to reduce the cost of the search process while maintaining high performance.
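As a sketch of the flavor of such a criterion (not the paper's exact derivation), one can stop when the gap between estimated expected validation losses of consecutive search phases falls below a tolerance; the windowing and the relative tolerance below are illustrative assumptions.

```python
import numpy as np

def should_stop(val_losses, window=5, rel_tol=1e-3):
    """Stop the search when the gap between the estimated expected
    validation losses of the previous and current search phases is
    small. `window` and `rel_tol` are illustrative choices, not the
    automatically determined threshold of the paper."""
    if len(val_losses) < 2 * window:
        return False
    prev = np.mean(val_losses[-2 * window:-window])
    curr = np.mean(val_losses[-window:])
    return abs(prev - curr) <= rel_tol * abs(prev)

# Hypothetical DARTS-style loop:
# for step in range(max_steps):
#     update_weights_and_architecture()
#     val_losses.append(estimate_expected_val_loss())  # over sampled archs
#     if should_stop(val_losses):
#         break
```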


Subject(s)
Algorithms , Neural Networks, Computer , Deep Learning , Machine Learning
2.
Neural Netw ; 164: 731-741, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37032243

ABSTRACT

In domain adaptation, prediction performance degrades when there is a large distance between the source and target domains. Gradual domain adaptation is one solution to this issue; it assumes access to intermediate domains that shift gradually from the source to the target domain. Previous works assumed that the number of samples in the intermediate domains was sufficiently large, so self-training was possible without the need for labeled data. If the number of accessible intermediate domains is restricted, the distances between domains become large and self-training will fail. In practice, the cost of samples in intermediate domains varies, and it is natural to assume that the closer an intermediate domain is to the target domain, the more expensive its samples are to obtain. To resolve this trade-off between cost and accuracy, we propose a framework that combines multifidelity and active domain adaptation. The effectiveness of the proposed method is evaluated in experiments with real-world datasets.
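A sketch of plain gradual self-training, the baseline that fails when the accessible intermediate domains are few; the classifier choice and names are illustrative, and the proposed multifidelity active component is omitted.

```python
from sklearn.linear_model import LogisticRegression

def gradual_self_train(Xs, ys, intermediate_Xs, Xt):
    """Gradual self-training: fit on labeled source data, then
    pseudo-label each intermediate domain in order and refit.
    This is the standard baseline the paper extends, not the
    proposed method itself."""
    clf = LogisticRegression(max_iter=1000).fit(Xs, ys)
    for X_mid in intermediate_Xs:      # domains ordered source -> target
        pseudo = clf.predict(X_mid)    # pseudo-labels degrade if domains
        clf = LogisticRegression(max_iter=1000).fit(X_mid, pseudo)  # are far apart
    return clf.predict(Xt)             # predictions on the target domain
```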


Subject(s)
Cost-Benefit Analysis
3.
Neural Comput ; 35(1): 82-103, 2022 Dec 14.
Article in English | MEDLINE | ID: mdl-36417591

ABSTRACT

We propose a nonlinear probabilistic generative model of Koopman mode decomposition based on an unsupervised Gaussian process. Existing data-driven methods for Koopman mode decomposition have focused on estimating its constituent quantities: eigenvalues, eigenfunctions, and modes. Our model enables the simultaneous estimation of these quantities and of latent variables governed by an unknown dynamical system. Furthermore, we introduce an efficient strategy for estimating the parameters of our model via low-rank approximations of covariance matrices. Applying the proposed model to both synthetic data and a real-world epidemiological dataset, we show that various analyses are available using the estimated parameters.
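For reference, a sketch of exact dynamic mode decomposition (DMD), the standard deterministic estimator of Koopman eigenvalues and modes that such probabilistic models generalize; this is the baseline, not the proposed Gaussian-process model.

```python
import numpy as np

def dmd(X, Y, rank):
    """Exact DMD: estimate Koopman eigenvalues and modes from snapshot
    matrices, with Y[:, k] the successor state of X[:, k]."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]        # truncated SVD
    A_tilde = U.conj().T @ Y @ Vh.conj().T / s         # low-rank operator
    eigvals, W = np.linalg.eig(A_tilde)                # Koopman eigenvalues
    modes = Y @ Vh.conj().T / s @ W / eigvals          # exact DMD modes
    return eigvals, modes
```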

4.
Neural Comput ; 34(12): 2432-2466, 2022 11 08.
Article in English | MEDLINE | ID: mdl-36283052

ABSTRACT

Domain adaptation aims to transfer knowledge of labeled instances from a source domain to a target domain in order to fill the gap between the domains. Most domain adaptation methods assume that the source and target domains have the same dimensionality; methods that remain applicable when the number of features differs between domains have rarely been studied, especially when no label information is given for the test data from the target domain. In this letter, it is assumed that common features exist in both domains and that extra (newly added) features are observed in the target domain, so the dimensionality of the target domain is higher than that of the source domain. To leverage the homogeneity of the common features, the adaptation between these source and target domains is formulated as an optimal transport (OT) problem. In addition, a learning bound in the target domain is derived for the proposed OT-based method. The proposed algorithm is validated using both simulated and real-world data.
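A minimal sketch of the OT step under stated assumptions: the coupling is computed on the shared features only, and a barycentric mapping carries source samples into the higher-dimensional target space. The variable names, the squared-Euclidean cost, and the barycentric mapping are illustrative, not the paper's exact formulation.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
Xs = rng.normal(size=(100, 5))             # source: 5 common features
Xt = rng.normal(size=(80, 8))              # target: 5 common + 3 extra features
common = slice(0, 5)

M = ot.dist(Xs[:, common], Xt[:, common])  # pairwise squared-Euclidean cost
a = np.full(100, 1 / 100)                  # uniform weights on source samples
b = np.full(80, 1 / 80)                    # uniform weights on target samples
G = ot.emd(a, b, M)                        # optimal coupling between domains

# Barycentric mapping: express each source sample in the target's
# (higher-dimensional) feature space via its coupled target samples.
Xs_mapped = (G / G.sum(axis=1, keepdims=True)) @ Xt
```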


Subject(s)
Algorithms , Learning
5.
Neural Comput ; 34(9): 1944-1977, 2022 08 16.
Article in English | MEDLINE | ID: mdl-35896163

ABSTRACT

Many machine learning methods assume that the training and test data follow the same distribution. In the real world, however, this assumption is often violated. In particular, a change in the marginal distribution of the data, called covariate shift, is one of the most important research topics in machine learning. We show that the well-known family of covariate shift adaptation methods is unified in the framework of information geometry. Furthermore, we show that parameter search for a geometrically generalized covariate shift adaptation method can be carried out efficiently. Numerical experiments show that our generalization can achieve better performance than the existing methods it encompasses.
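A sketch of one standard member of the family the paper unifies: importance weighting, with the density ratio estimated by a domain classifier. The logistic-regression estimator is an illustrative choice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_train, X_test):
    """Estimate density-ratio weights w(x) = p_test(x) / p_train(x)
    with a classifier that discriminates training from test inputs
    (a classical covariate shift adaptation ingredient, not the
    proposed geometric generalization itself)."""
    X = np.vstack([X_train, X_test])
    d = np.r_[np.zeros(len(X_train)), np.ones(len(X_test))]
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p = clf.predict_proba(X_train)[:, 1]
    n_tr, n_te = len(X_train), len(X_test)
    return (p / (1 - p)) * (n_tr / n_te)  # odds, corrected for class sizes

# The weights are then used, e.g., as sample_weight when fitting a
# downstream model on the training data.
```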


Subject(s)
Algorithms , Machine Learning
6.
Neural Netw ; 149: 29-39, 2022 May.
Article in English | MEDLINE | ID: mdl-35183852

ABSTRACT

A large number of neurons form cell assemblies that process information in the brain. Recent developments in measurement technology, one of which is calcium imaging, have made it possible to study cell assemblies. In this study, we aim to extract cell assemblies from calcium imaging data, and we propose a clustering approach based on non-negative matrix factorization (NMF). The proposed approach first obtains a similarity matrix between neurons by NMF and then performs spectral clustering on it. Applying NMF entails a model selection problem: the number of bases in NMF affects the result considerably, and a suitable selection method has yet to be established. We attempt to resolve this problem by model averaging with a newly defined estimator based on NMF. Experiments on simulated data suggest that the proposed approach is superior to conventional correlation-based clustering methods over a wide range of sampling rates. We also analyzed calcium imaging data from sleeping/waking mice, and the results suggest that the size of a cell assembly depends on the degree and spatial extent of slow wave generation in the cerebral cortex.
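A minimal sketch of the two-stage pipeline under stated assumptions (cosine similarity between the NMF bases' neuron loadings, a fixed number of bases); the paper's model averaging over the number of bases is omitted.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder nonnegative activity matrix: time bins x neurons.
X = np.abs(np.random.default_rng(0).normal(size=(1000, 50)))

H = NMF(n_components=10, init="nndsvda", max_iter=500).fit(X).components_
S = cosine_similarity(H.T)                 # neuron-by-neuron similarity
labels = SpectralClustering(
    n_clusters=5, affinity="precomputed"
).fit_predict(S)                           # cell-assembly assignments
```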


Subject(s)
Algorithms , Calcium , Animals , Cluster Analysis , Diagnostic Imaging , Mice , Neurons
7.
Entropy (Basel) ; 23(5)2021 Apr 25.
Article in English | MEDLINE | ID: mdl-33923103

ABSTRACT

The asymmetric skew divergence smooths one of the two distributions by mixing it, to a degree determined by the parameter λ, with the other distribution. This divergence approximates the KL divergence without requiring the target distribution to be absolutely continuous with respect to the source distribution. In this paper, an information-geometric generalization of the skew divergence, called the α-geodesical skew divergence, is proposed, and its properties are studied.
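In one common parameterization (conventions for the direction of mixing and the role of λ vary across papers, so this is an illustrative form rather than the paper's exact notation), the skew divergence is

\[ D_\lambda(p \,\|\, q) \;=\; \mathrm{KL}\bigl(p \,\big\|\, (1-\lambda)\,p + \lambda\,q\bigr), \qquad \lambda \in [0, 1], \]

which recovers \(\mathrm{KL}(p\,\|\,q)\) as \(\lambda \to 1\) and is finite for all \(\lambda < 1\) because the mixture is positive wherever \(p\) is. As the name suggests, the α-geodesical skew divergence generalizes this by interpolating between the two distributions along an α-geodesic of information geometry rather than along the linear (m-)mixture.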

8.
Sci Rep ; 10(1): 21790, 2020 12 11.
Article in English | MEDLINE | ID: mdl-33311555

ABSTRACT

Determination of the crystal system and space group in the initial stages of crystal structure analysis forms a bottleneck in the materials science workflow that often requires manual tuning. Here we propose a machine-learning (ML)-based approach for crystal system and space group classification from powder X-ray diffraction (XRD) patterns, as a proof of concept using simulated patterns. Our tree-ensemble-based ML model achieves nearly or over 90% accuracy for crystal system classification, except for triclinic cases, and 88% accuracy for space group classification with five candidates. We also succeeded in quantifying empirical knowledge vaguely shared among experts, showing the possibility of data-driven discovery of unrecognised characteristics embedded in experimental data through an interpretable ML approach.
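A minimal sketch of the tree-ensemble classification setup; the ExtraTrees model, the 2θ binning, and the random placeholder data are assumptions standing in for the paper's simulated patterns.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((500, 1024))        # 500 placeholder patterns, 1024 2-theta bins
y = rng.integers(0, 7, size=500)   # 7 crystal systems as class labels

clf = ExtraTreesClassifier(n_estimators=500, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # held-out accuracy

# Feature importances indicate which angular regions drive the decision,
# one route to the interpretability the paper exploits:
# clf.fit(X, y); clf.feature_importances_
```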

9.
Neural Comput ; 32(10): 1901-1935, 2020 10.
Article in English | MEDLINE | ID: mdl-32795231

ABSTRACT

Principal component analysis (PCA) is a widely used method for data processing, such as dimension reduction and visualization. Standard PCA is known to be sensitive to outliers, and various robust PCA methods have been proposed. It has been shown that the robustness of many statistical methods can be improved by using mode estimation instead of mean estimation, because mode estimation is not significantly affected by the presence of outliers. This study therefore proposes modal principal component analysis (MPCA), a robust PCA method based on mode estimation. The proposed method finds the minor component by estimating the mode of the projected data points. As theoretical contributions, the probabilistic convergence property, the influence function, the finite-sample breakdown point, and a lower bound on the breakdown point are derived for the proposed MPCA. The experimental results show that the proposed method has advantages over conventional methods.
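A sketch of the mode-estimation building block, via a kernel density estimate on projected data; the Gaussian KDE and the grid-search maximization are illustrative choices, not the paper's exact estimator.

```python
import numpy as np
from scipy.stats import gaussian_kde

def mode_1d(x, grid_size=512):
    """Mode of a 1D sample via KDE: the ingredient MPCA applies to
    data projected onto a candidate direction (a sketch of the
    building block, not the full MPCA algorithm)."""
    kde = gaussian_kde(x)
    grid = np.linspace(x.min(), x.max(), grid_size)
    return grid[np.argmax(kde(grid))]

# Unlike the mean, the mode barely moves when outliers are added:
x = np.random.default_rng(0).normal(size=500)
x_out = np.r_[x, 20 + np.random.default_rng(1).normal(size=25)]
print(x.mean(), x_out.mean())       # mean is pulled toward the outliers
print(mode_1d(x), mode_1d(x_out))   # mode stays near 0
```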

10.
Cereb Cortex ; 30(7): 3977-3990, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32037455

ABSTRACT

Sleep exerts modulatory effects on the cerebral cortex. Whether sleep modulates local connectivity in the cortex or only individual neural activity, however, is poorly understood. Here we investigated functional connectivity, that is, covarying activity between neurons, during spontaneous sleep-wake states and during and after sleep deprivation, using calcium imaging of identified excitatory/inhibitory neurons in the motor cortex. Functional connectivity was estimated with a statistical learning approach, the graphical lasso (glasso), and quantified by "the probability of establishing connectivity (sparse/dense)" and "the strength of the established connectivity (weak/strong)." Local cortical connectivity was sparse in non-rapid eye movement (NREM) sleep and dense in REM sleep, a pattern similar in both excitatory and inhibitory neurons. The overall mean strength of the connectivity did not differ greatly across spontaneous sleep-wake states. Sleep deprivation induced strong excitatory/inhibitory and dense inhibitory, but not excitatory, connectivity. Subsequent NREM sleep after sleep deprivation exhibited weak excitatory/inhibitory, sparse excitatory, and dense inhibitory connectivity. These findings indicate that sleep-wake states modulate local cortical connectivity, and that this modulation is large and compensatory, stabilizing local circuits during the homeostatic control of sleep and contributing to plastic changes in neural information flow.
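A minimal sketch of the connectivity estimation step using scikit-learn's graphical lasso; the CV-selected penalty and the random placeholder data are assumptions, and the paper's sleep-state-wise analysis is omitted.

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
activity = rng.normal(size=(2000, 40))   # placeholder: 2000 frames, 40 neurons

model = GraphicalLassoCV().fit(activity)
P = model.precision_                     # sparse precision matrix

# "Probability of establishing connectivity" ~ fraction of nonzero
# off-diagonal entries; "strength" ~ magnitude of partial correlations.
off = ~np.eye(P.shape[0], dtype=bool)
density = np.mean(P[off] != 0)
partial_corr = -P / np.sqrt(np.outer(np.diag(P), np.diag(P)))
strength = np.mean(np.abs(partial_corr[off]))
print(density, strength)
```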


Subject(s)
Cerebral Cortex/physiology , Sleep Deprivation/physiopathology , Sleep/physiology , Wakefulness/physiology , Animals , Cerebral Cortex/metabolism , Cerebral Cortex/pathology , Electroencephalography , Electromyography , Homeostasis , Mice , Microscopy, Confocal , Motor Cortex/metabolism , Motor Cortex/pathology , Motor Cortex/physiology , Neural Pathways/metabolism , Neural Pathways/pathology , Neural Pathways/physiology , Optical Imaging , Sleep Deprivation/metabolism , Sleep Deprivation/pathology , Sleep Stages/physiology , Sleep, REM/physiology
11.
Sci Rep ; 9(1): 1526, 2019 Feb 06.
Article in English | MEDLINE | ID: mdl-30728390

ABSTRACT

We propose a method to accelerate small-angle scattering experiments by exploiting spatial correlation in two-dimensional data. We applied kernel density estimation to the average of a hundred short scans and evaluated the noise reduction effects of kernel density estimation (smoothing). Although smoothing offers no advantage for isotropic data, owing to the powerful noise reduction effect of radial averaging, smoothing with a statistically and physically appropriate kernel can shorten the measurement time to less than half of that required to obtain sector averages of comparable statistical quality without smoothing. This benefit should encourage researchers not to apply a full radial average to anisotropic data, sacrificing anisotropy for statistical quality. We also confirmed that a statistically reasonable estimate of the measurement time is feasible on site, by evaluating how intensity variances improve with accumulating counts. The noise reduction effect of smoothing will benefit a wide range of applications, from efficient use of beamtime at laboratories and large experimental facilities to stroboscopic measurements that suffer from low statistical quality.
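A minimal sketch of the smoothing-then-sector-averaging pipeline, with a fixed-bandwidth Gaussian kernel standing in for the paper's statistically and physically appropriate kernel; the image size, wedge angles, and sigma are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
image = rng.poisson(2.0, size=(512, 512)).astype(float)  # noisy 2D counts

smoothed = gaussian_filter(image, sigma=2.0)             # KDE-style smoothing

# Sector average: mean intensity over radial bins within an angular wedge.
y, x = np.indices(image.shape)
cy, cx = 255.5, 255.5
r = np.hypot(y - cy, x - cx).astype(int)
theta = np.degrees(np.arctan2(y - cy, x - cx))
wedge = (theta > -15) & (theta < 15)
counts = np.bincount(r[wedge])
sums = np.bincount(r[wedge], weights=smoothed[wedge])
profile = sums / np.maximum(counts, 1)                   # guard empty bins
```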

12.
Neural Netw ; 108: 172-191, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30199783

ABSTRACT

Plasticity is one of the most important properties of the nervous system; it enables animals to adjust their behavior to an ever-changing external environment. Changes in synaptic efficacy between neurons constitute one of the major mechanisms of plasticity, so estimating neural connections is crucial for investigating information processing in the brain. Although many analysis methods have been proposed for this purpose, most of them suffer from one or more of the following mathematical difficulties: (1) only partially observed neural activity is available; (2) correlations can include both direct and indirect pseudo-interactions; and (3) the biological evidence that a neuron typically has only one type of connection (excitatory or inhibitory) should be taken into account. To overcome these difficulties, a novel probabilistic framework for estimating neural connections from partially observed spikes is proposed in this paper. First, based on the properties of sums of random variables, the proposed method estimates the influence of unobserved neurons on observed neurons and extracts only the correlations among observed neurons. Second, the relationship between pseudo-correlations and target connections is modeled by neural propagation in a multiplicative manner. Third, a novel information-theoretic framework is proposed for estimating neuron types. The proposed method was validated using spike data generated by artificial neural networks, and it was then applied to multi-unit data recorded from the CA1 area of a rat's hippocampus. The results confirmed that our estimates are consistent with previous reports. These findings indicate that the proposed method is useful for extracting crucial interactions in neural signals as well as in other multi-probed point process data.


Subject(s)
Action Potentials , Nerve Net , Neural Networks, Computer , Action Potentials/physiology , Animals , Hippocampus/physiology , Nerve Net/physiology , Neurons/physiology , Rats
13.
Neural Netw ; 108: 68-82, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30173055

ABSTRACT

Electroencephalography (EEG) is a non-invasive brain imaging technique that describes neural electrical activation with good temporal resolution. Source localization is required for clinical and functional interpretation of EEG signals and is most commonly achieved via the dipole model; however, the number of dipoles in the brain must be determined for a reasonably accurate interpretation. In this paper, we propose a dipole source localization (DSL) method that adaptively estimates the dipole number by using a novel information criterion. Because the particle filtering process is nonparametric, it is not clear whether conventional information criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) can be applied. In the proposed method, multiple particle filters run in parallel, each estimating the dipole locations and moments under the assumption that the dipole number is known and fixed; at every time step, the most predictive particle filter is selected using an information criterion tailored to particle filters. We first tested the proposed information criterion in experiments on artificial datasets, which supported the hypothesis that it would outperform both AIC and BIC. We then used the proposed method to analyze real human EEG datasets collected during an auditory short-term memory task, and found that the alpha-band dipoles were localized to the right and left auditory areas during the task, consistent with previous physiological findings. These analyses suggest that the proposed information criterion can work well in both simulated and real-world settings.


Subject(s)
Auditory Perception/physiology , Brain/physiology , Electroencephalography/methods , Adult , Algorithms , Bayes Theorem , Brain Mapping/methods , Female , Humans
14.
Sci Rep ; 8(1): 8111, 2018 05 25.
Article in English | MEDLINE | ID: mdl-29802305

ABSTRACT

Analyses of volcanic ash are typically performed either by qualitatively classifying ash particles by eye or by quantitatively parameterizing their shape and texture. While complex shapes can be classified through qualitative analyses, the results are subjective because of the difficulty of categorizing complex shapes into a single class. Quantitative analyses are objective, but they require a selection of shape parameters. Here, we applied a convolutional neural network (CNN) to the classification of volcanic ash. First, we defined four basal particle shapes (blocky, vesicular, elongated, rounded) generated by different eruption mechanisms (e.g., brittle fragmentation), and then trained the CNN using particles composed of only one basal shape. The CNN recognized the basal shapes with over 90% accuracy. Using the trained network, we classified ash particles composed of multiple basal shapes based on the output of the network, which can be interpreted as a mixing ratio of the four basal shapes. Clustering of samples by the averaged probabilities and the intensity is consistent with the eruption type. The mixing ratio output by the CNN can be used to quantitatively classify complex natural shapes without forcing particles into single categories and without the need for shape parameters, which may lead to a new taxonomy.
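A minimal PyTorch sketch of the idea: a small CNN whose softmax output over the four basal shapes is read as a mixing ratio. The architecture, input size, and names are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class AshNet(nn.Module):
    """Toy CNN over 64x64 grayscale particle images; the softmax over
    the four basal shapes (blocky, vesicular, elongated, rounded) is
    interpreted as a mixing ratio."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):                     # x: (batch, 1, 64, 64)
        logits = self.head(self.features(x).flatten(1))
        return logits.softmax(dim=1)          # rows sum to 1 over 4 shapes

model = AshNet()
ratio = model(torch.randn(8, 1, 64, 64))      # mixing ratios for 8 particles
```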

15.
Sci Rep ; 7(1): 6129, 2017 07 21.
Article in English | MEDLINE | ID: mdl-28733582

ABSTRACT

The down-dip limit of the seismogenic zone and the up-dip and down-dip limits of the deep low-frequency tremors in southwest Japan are clearly imaged by the hypocentre distribution. Previous studies using smoothness constraints in inversion analyses estimated that long-term slow slip events (L-SSEs) beneath the Bungo Channel are distributed smoothly from the down-dip part of the seismogenic zone to the up-dip part of the tremors. Here, we use fused regularisation, a type of sparse modelling suited to detecting discontinuous changes in the model parameters, to estimate the slip distribution of the L-SSEs. The largest slip drops abruptly to zero at the down-dip limit of the seismogenic zone, is immediately halved at the up-dip limit of the tremors, and vanishes near their down-dip limit. Such correspondences imply that thresholds exist in the generation processes of both tremors and SSEs. Geodetic data inversion with sparse modelling can thus resolve such sharp features in the slip distribution.
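A minimal 1D sketch of fused regularisation under illustrative assumptions (a synthetic Green's function matrix G, data d, and penalty weight): the L1 penalty on differences between neighboring parameters favors piecewise-constant solutions, preserving the abrupt changes that smoothness constraints blur.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m_true = np.r_[np.zeros(20), np.ones(30), np.zeros(25)]  # abrupt edges
G = rng.normal(size=(60, 75))                            # toy Green's functions
d = G @ m_true + 0.05 * rng.normal(size=60)              # noisy surface data

m = cp.Variable(75)
lam = 1.0
objective = cp.Minimize(cp.sum_squares(G @ m - d) + lam * cp.tv(m))
cp.Problem(objective).solve()
m_hat = m.value   # recovers sharp transitions that smooth priors blur
```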

16.
Neural Comput ; 29(7): 1838-1878, 2017 07.
Article in English | MEDLINE | ID: mdl-28410058

ABSTRACT

We propose a method for intrinsic dimension estimation. By fitting a regression model that relates a power of the distance from an inspection point to the number of samples contained in a ball of that radius, we estimate the goodness of fit. Then, using the maximum likelihood method, we estimate the local intrinsic dimension around the inspection point. The proposed method is shown to be comparable to conventional methods in global intrinsic dimension estimation experiments. Furthermore, we show experimentally that the proposed method outperforms a conventional local dimension estimation method.
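A sketch of the underlying ball-counting idea, assuming the classical regression form (the paper's maximum-likelihood refinement is omitted): in d dimensions the neighbor count grows as N(r) ∝ r^d, so the log-log slope estimates the local dimension.

```python
import numpy as np

def local_id(X, i, k=50):
    """Local intrinsic dimension around inspection point X[i]:
    fit the slope of log N(r) versus log r over the k nearest
    neighbors (the basic regression idea, not the exact method)."""
    r = np.sort(np.linalg.norm(X - X[i], axis=1))[1:k + 1]  # skip self
    N = np.arange(1, k + 1)
    slope, _ = np.polyfit(np.log(r), np.log(N), 1)
    return slope

X = np.random.default_rng(0).normal(size=(2000, 3))  # truly 3-dimensional
print(local_id(X, 0))  # roughly 3
```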

17.
PLoS One ; 12(1): e0169981, 2017.
Article in English | MEDLINE | ID: mdl-28076383

ABSTRACT

In a product market or stock market, different products or stocks compete for the same consumers or purchasers. We propose a method to estimate the time-varying transition matrix of product shares from a multivariate time series of those shares. The method is based on the assumption that each observed time series of shares is a stationary distribution of an underlying Markov process characterized by a transition probability matrix, and we estimate transition probability matrices for every observation under natural assumptions. We demonstrate, on a real-world dataset of automobile market shares, that the proposed method can uncover the intrinsic transitions of shares. The resulting transition matrices reveal interesting phenomena, for example, the change in flows between the TOYOTA group and the GM group in the fiscal year in which TOYOTA's sales beat GM's, which is a reasonable scenario.
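A deliberately simplified one-step sketch (the paper's estimator, built on the stationarity assumption, is more refined): find a row-stochastic matrix that maps one period's shares toward the next while staying close to the identity, since shares change slowly. The share vectors and regularisation weight are illustrative.

```python
import cvxpy as cp
import numpy as np

s0 = np.array([0.35, 0.30, 0.20, 0.15])   # shares of 4 product groups
s1 = np.array([0.38, 0.27, 0.20, 0.15])   # shares one period later

P = cp.Variable((4, 4), nonneg=True)
objective = cp.Minimize(cp.sum_squares(s0 @ P - s1)
                        + 0.1 * cp.sum_squares(P - np.eye(4)))
cp.Problem(objective, [cp.sum(P, axis=1) == 1]).solve()
print(P.value)   # P[i, j]: probability a consumer moves from product i to j
```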


Subject(s)
Algorithms , Automobiles , Commerce/statistics & numerical data , Consumer Behavior/statistics & numerical data , Statistics as Topic/methods , Automobiles/economics , Automobiles/statistics & numerical data , Humans , Markov Chains , Probability , Time Factors
18.
Neural Comput ; 28(12): 2687-2725, 2016 12.
Article in English | MEDLINE | ID: mdl-27626969

ABSTRACT

This study considers the common situation in data analysis in which there are few observations of the distribution of interest (the target distribution), while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions; in other words, to approximate the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions so as to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the m-mixture and the e-mixture. The m-mixture is applied in a variety of research fields because of the well-known expectation-maximization algorithm for parameter estimation, whereas the e-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The e-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for nonparametric modeling of the e-mixture together with a geometrically inspired estimation algorithm. As a numerical example of the proposed framework, a transfer learning setup is considered. The experimental results show that the framework works well for three types of synthetic data sets as well as for a real-world EEG data set.
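To make the distinction concrete, with component distributions \(p_i\) and weights \(c_i \ge 0\), \(\sum_i c_i = 1\), the two mixtures take their standard information-geometric forms

\[ p_m(x) = \sum_i c_i\, p_i(x), \qquad \log p_e(x) = \sum_i c_i \log p_i(x) - b(c), \]

where \(b(c)\) is the normalizing constant of \(p_e\). The m-mixture is linear in probabilities, which is what makes it amenable to the expectation-maximization algorithm, while the e-mixture is linear in log-probabilities, which is why it arises as the maximum-entropy solution.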

19.
Neural Netw ; 66: 64-78, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25805366

ABSTRACT

An image super-resolution method based on multiple observations of low-resolution images is proposed. The method combines sub-pixel-accuracy block matching, for estimating the relative displacements of the observed images, with sparse signal representation, for estimating the corresponding high-resolution image, where the correspondence between high- and low-resolution images is modeled by a certain degradation process. The relative displacements of small patches of the observed low-resolution images are estimated accurately by a computationally efficient block matching method, and the matching scores are used to select a subset of low-resolution patches for reconstructing each high-resolution patch; that is, an adaptive selection of informative low-resolution images is realized. The proposed method is shown to perform comparably to or better than conventional super-resolution methods in experiments using various images.
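A sketch of the sub-pixel displacement estimation that the method builds on, with skimage's upsampled cross-correlation standing in for the paper's block matching; this illustrates the ingredient, not the exact algorithm.

```python
import numpy as np
from skimage.registration import phase_cross_correlation

rng = np.random.default_rng(0)
ref = rng.normal(size=(64, 64))
obs = np.roll(ref, shift=(2, 3), axis=(0, 1))   # known shift for the demo

shift, error, _ = phase_cross_correlation(ref, obs, upsample_factor=20)
print(shift)   # estimated (row, col) displacement, to 1/20 pixel
# A low matching error marks informative observations worth keeping for
# reconstructing each high-resolution patch.
```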


Subject(s)
Algorithms , Image Enhancement/methods , Pattern Recognition, Automated/methods
20.
Neural Comput ; 26(9): 2074-101, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24922504

ABSTRACT

Clustering is a representative unsupervised learning task and one of the important approaches in exploratory data analysis. By its very nature, clustering without strong assumptions on the data distribution is desirable. Information-theoretic clustering is a class of clustering methods that optimize information-theoretic quantities such as entropy and mutual information. These quantities can be estimated in a nonparametric manner, and information-theoretic clustering algorithms are capable of capturing various intrinsic data structures. It is also possible to estimate information-theoretic quantities from a data set with a sampling weight for each datum: by assuming that the data are sampled from a certain cluster and assigning different sampling weights depending on the cluster, cluster-conditional information-theoretic quantities can be estimated. In this letter, a simple iterative clustering algorithm is proposed based on a nonparametric estimator of the log-likelihood for weighted data sets. The clustering algorithm is also derived from the principle of conditional entropy minimization with maximum entropy regularization, and it contains no tuning parameter. The algorithm is experimentally shown to perform comparably to or better than conventional nonparametric clustering methods.
