1.
Article in English | MEDLINE | ID: mdl-26356862

ABSTRACT

Understanding genetic differences among populations is one of the most important issues in population genetics. Genetic variations, e.g., single nucleotide polymorphisms, are used to characterize commonalities and differences among individuals from various populations. This paper presents an efficient graph-based clustering framework, called the iNJclust algorithm, which operates iteratively on the Neighbor-Joining (NJ) tree. The framework uses well-known genetic measurements, namely the allele-sharing distance, the neighbor-joining tree, and the fixation index. The behavior of the fixation index is utilized in the algorithm's stopping criterion. The algorithm provides an estimated number of populations, individual assignments, and relationships between populations as outputs. The clustering result is reported in the form of a binary tree, whose terminal nodes represent the final inferred populations and whose structure preserves the genetic relationships among them. The clustering performance and the robustness of the proposed algorithm are tested extensively using simulated and real data sets from bovine, sheep, and human populations. The results indicate that the number of populations within each data set is reasonably estimated, the individual assignment is robust, and the structure of the inferred population tree corresponds to the intrinsic relationships among populations within the data.
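
The allele-sharing distance mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes one common coding: each SNP genotype is stored as 0, 1, or 2 copies of a chosen allele, the per-locus distance is the absolute difference, and the mean is scaled to the range 0 to 1. The function names and the use of `None` for missing calls are hypothetical and are not taken from iNJclust.

```python
def allele_sharing_distance(g1, g2):
    """Mean allele-sharing distance between two individuals (assumed coding).

    g1, g2: equal-length sequences of genotype codes 0, 1, or 2
    (copies of one chosen allele per SNP); None marks a missing call.
    """
    total, used = 0, 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip loci with a missing genotype in either individual
        total += abs(a - b)  # 0, 1, or 2 alleles differ at this locus
        used += 1
    if used == 0:
        raise ValueError("no loci genotyped in both individuals")
    return total / (2 * used)  # scale so the result lies in [0, 1]


def distance_matrix(genotypes):
    """Symmetric pairwise distance matrix, e.g. as input to a tree builder."""
    n = len(genotypes)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = allele_sharing_distance(genotypes[i], genotypes[j])
    return d
```

A matrix of this kind is the usual input to a neighbor-joining tree construction; the tree building and the fixation-index stopping rule are not shown here.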


Subject(s)
Cluster Analysis , Genetics, Population/methods , Genomics/methods , Algorithms , Animals , Cattle , Computer Simulation , Databases, Genetic , Humans , Sheep , Software
2.
BMC Bioinformatics ; 12: 255, 2011 Jun 23.
Article in English | MEDLINE | ID: mdl-21699684

ABSTRACT

BACKGROUND: The ever increasing sizes of population genetic datasets pose great challenges for population structure analysis. The Tracy-Widom (TW) statistical test is widely used for detecting structure. However, it has not been adequately investigated whether the TW statistic is susceptible to type I error, especially in large, complex datasets. Non-parametric, Principal Component Analysis (PCA) based methods for resolving structure have been developed which rely on the TW test. Although PCA-based methods can resolve structure, they cannot infer ancestry. Model-based methods are still needed for ancestry analysis, but they are not suitable for large datasets. We propose a new structure analysis framework for large datasets. This includes a new heuristic for detecting structure and incorporation of the structure patterns inferred by a PCA method to complement STRUCTURE analysis.

RESULTS: A new heuristic called EigenDev for detecting population structure is presented. When tested on simulated data, this heuristic is robust to sample size. In contrast, the TW statistic was found to be susceptible to type I error, especially for large population samples. EigenDev is thus better-suited for analysis of large datasets containing many individuals, in which spurious patterns are likely to exist and could be incorrectly interpreted as population stratification. EigenDev was applied to the iterative pruning PCA (ipPCA) method, which resolves the underlying subpopulations. This subpopulation information was used to supervise STRUCTURE analysis to infer patterns of ancestry at an unprecedented level of resolution. To validate the new approach, a bovine and a large human genetic dataset (3945 individuals) were analyzed. We found new ancestry patterns consistent with the subpopulations resolved by ipPCA.

CONCLUSIONS: The EigenDev heuristic is robust to sampling and is thus superior for detecting structure in large datasets. The application of EigenDev to the ipPCA algorithm improves the estimation of the number of subpopulations and the individual assignment accuracy, especially for very large and complex datasets. Furthermore, we have demonstrated that the structure resolved by this approach complements parametric analysis, allowing a much more comprehensive account of population structure. The new version of the ipPCA software with EigenDev incorporated can be downloaded from http://www4a.biotec.or.th/GI/tools/ippca.


Subject(s)
Algorithms , Cattle/genetics , Population Groups/genetics , Principal Component Analysis , Animals , Artificial Intelligence , Genetics, Population , Genome, Human , Haplotypes , Humans
3.
IEEE Trans Biomed Eng ; 57(3): 616-25, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19789097

ABSTRACT

This paper presents a spatiotemporal framework for estimating single-trial response latencies and amplitudes from evoked response magnetoencephalographic/electroencephalographic data. Spatial and temporal bases are employed to capture the aspects of the evoked response that are consistent across trials. Trial amplitudes are assumed to be independent and to share the same underlying normal distribution with unknown mean and variance. The trial latency is assumed to be deterministic but unknown. We assume that the noise is spatially correlated with unknown covariance matrix. We introduce a generalized expectation-maximization algorithm called Trial Variability in Amplitude and Latency (TriViAL) that computes the maximum likelihood (ML) estimates of the amplitudes, latencies, basis coefficients, and noise covariance matrix. The proposed approach also performs ML source localization by scanning the TriViAL algorithm over spatial bases corresponding to different locations on the cortical surface. Source locations are identified as the locations corresponding to large likelihood values. The effectiveness of the TriViAL algorithm is demonstrated using simulated data and human evoked response experiments. The localization performance is validated using tactile stimulation of the finger. The efficacy of the algorithm in estimating latency variability is shown using the known dependence of the M100 auditory response latency on stimulus tone frequency. We also demonstrate that estimation of response amplitude is improved when latency is included in the signal model.
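
To make the idea of a per-trial latency concrete, the sketch below estimates an integer sample shift for one single-channel trial by sliding a known response template over it and taking the shift with the largest cross-correlation. This is a deliberately simplified stand-in and not the TriViAL algorithm: it assumes a known waveform, a single channel, and no spatially correlated noise. The function name and the `max_shift` parameter are hypothetical.

```python
def estimate_latency(trial, template, max_shift):
    """Return the integer shift (in samples) that best aligns template to trial.

    Positive values mean the response in `trial` occurs later than in
    `template`. Samples shifted outside the trial are ignored.
    """
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        score = 0.0
        for t, value in enumerate(template):
            idx = t + shift
            if 0 <= idx < len(trial):
                score += value * trial[idx]  # cross-correlation at this shift
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Usage: a response delayed by two samples relative to the template.
template = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
shift = estimate_latency(delayed, template, max_shift=3)
```

The framework in the abstract instead estimates latencies jointly with amplitudes, basis coefficients, and the noise covariance inside a generalized expectation-maximization loop.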


Subject(s)
Electroencephalography/methods , Evoked Potentials/physiology , Magnetoencephalography/methods , Signal Processing, Computer-Assisted , Algorithms , Computer Simulation , Fingers/physiology , Humans , Reproducibility of Results
4.
IEEE Trans Biomed Eng ; 56(3): 633-45, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19272883

ABSTRACT

A spatiotemporal framework for estimating trial-to-trial variability in evoked response (ER) data is presented. Spatial and temporal bases capture the aspects of the response that are consistent across trials, while the basis expansion coefficients represent the variable components of the response. We focus on the simplest case of constant spatiotemporal response shape and varying amplitude across trials. Two different constraints on the amplitude evolution are employed to effectively integrate the individual responses and improve robustness at low SNR. The linear dynamical system response constraint estimates the current trial amplitude as an unknown constant scaling of the estimate in the previous trial plus zero-mean Gaussian noise with unknown variance. The independent response constraint estimates response amplitudes across trials as independent Gaussian random variables having unknown mean and variance. We develop a generalized expectation-maximization algorithm to obtain the maximum-likelihood (ML) estimates of the signal waveform, noise covariance matrix, and unknown constraint parameters. ML source localization is achieved by scanning the likelihood over different sets of spatial bases. We demonstrate the variability estimation and source localization effectiveness of the proposed algorithms using both real and simulated ER data.
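
The two amplitude constraints can be illustrated in isolation. The sketch below assumes, purely for illustration, that the per-trial amplitudes are observed directly, in which case the maximum-likelihood parameters have closed forms. In the framework described above the amplitudes are hidden behind noisy multichannel data and are estimated by a generalized expectation-maximization algorithm, which is not reproduced here. The function names are hypothetical.

```python
def fit_independent(amplitudes):
    """ML mean and variance for amplitudes modelled as i.i.d. Gaussian."""
    n = len(amplitudes)
    mean = sum(amplitudes) / n
    var = sum((a - mean) ** 2 for a in amplitudes) / n  # ML uses 1/n, not 1/(n-1)
    return mean, var


def fit_linear_dynamical(amplitudes):
    """ML scaling c and noise variance for a[k] = c * a[k-1] + w[k].

    w[k] is zero-mean Gaussian noise; the fit conditions on the first amplitude.
    """
    prev, curr = amplitudes[:-1], amplitudes[1:]
    denom = sum(p * p for p in prev)
    if denom == 0:
        raise ValueError("previous amplitudes are all zero; scaling is undefined")
    scale = sum(p * c for p, c in zip(prev, curr)) / denom  # least-squares slope
    var = sum((c - scale * p) ** 2 for p, c in zip(prev, curr)) / len(curr)
    return scale, var
```

For example, `fit_linear_dynamical([1.0, 2.0, 4.0, 8.0])` recovers a scaling of 2 with zero residual variance, since each amplitude is exactly twice the previous one.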


Subject(s)
Electroencephalography , Evoked Potentials , Magnetoencephalography , Signal Processing, Computer-Assisted , Algorithms , Brain/physiology , Brain Mapping , Computer Simulation , Humans , Linear Models , Models, Statistical , Normal Distribution
5.
IEEE Trans Biomed Eng ; 53(9): 1740-54, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16941830

ABSTRACT

A new source model for representing spatially distributed neural activity is presented. The signal of interest is modeled as originating from a patch of cortex and is represented using a set of basis functions. Each cortical patch has its own set of bases, which allows representation of arbitrary source activity within the patch. This is in contrast to previously proposed cortical patch models which assume a specific distribution of activity within the patch. We present a procedure for designing bases that minimize the normalized mean squared representation error, averaged over different activity distributions within the patch. Extension of existing algorithms to the basis function framework is straightforward and is illustrated using linearly constrained minimum variance (LCMV) spatial filtering and maximum-likelihood signal estimation/generalized likelihood ratio test (ML/GLRT). The number of bases chosen for each patch determines a tradeoff between representation accuracy and the ability to differentiate between distinct patches. We propose choosing the minimum number of bases that satisfy a constraint on the normalized mean squared representation accuracy. A mismatch analysis for LCMV and ML/GLRT is presented to show that this is an appropriate strategy for choosing the number of bases. The effectiveness of the patch basis model is demonstrated using real and simulated evoked response data. We show that significant changes in performance occur as the number of basis functions varies, and that very good results are obtained by allowing modest representation error.
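
The rule of choosing the minimum number of bases that meets a representation-error constraint can be sketched as follows. The abstract does not say how the bases are designed, so this assumes they are the leading eigenvectors of an averaged correlation matrix, in which case the normalized mean squared error of keeping k bases is the fraction of the eigenvalue sum that is discarded. The function name and the `max_error` parameter are hypothetical.

```python
def min_bases(eigenvalues, max_error):
    """Smallest k whose normalized representation error is <= max_error.

    eigenvalues: non-negative eigenvalues of the (assumed) averaged
    correlation matrix for one cortical patch, in any order.
    max_error: allowed normalized mean squared error, between 0 and 1.
    """
    spectrum = sorted(eigenvalues, reverse=True)
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    retained = 0.0
    for k, value in enumerate(spectrum, start=1):
        retained += value
        if 1.0 - retained / total <= max_error:  # discarded energy fraction
            return k
    return len(spectrum)  # rounding fallback: keep every basis
```

With eigenvalues 6, 3, and 1, a tolerance of 0.15 selects two bases (error 0.1), while a tolerance of 0.5 selects one (error 0.4), which shows how allowing a modest error reduces the number of bases per patch.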


Subject(s)
Brain Mapping/methods , Cerebral Cortex/physiology , Diagnosis, Computer-Assisted/methods , Electroencephalography/methods , Evoked Potentials/physiology , Magnetoencephalography/methods , Models, Neurological , Algorithms , Anisotropy , Computer Simulation , Humans