Results 1 - 11 of 11
1.
IEEE Trans Pattern Anal Mach Intell ; 46(7): 4843-4849, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38265902

ABSTRACT

This paper studies a new curve-fitting approach to data on Riemannian manifolds. We define a principal curve based on a mixture model for observations and unobserved latent variables and propose a new algorithm to estimate the principal curve for given data points on Riemannian manifolds.

2.
J Classif ; : 1-25, 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37359508

ABSTRACT

This study develops a new clustering method for high-dimensional zero-inflated time series data. The proposed method is based on thick-pen transform (TPT), in which the basic idea is to draw along the data with a pen of a given thickness. Since TPT is a multi-scale visualization technique, it provides some information on the temporal tendency of neighborhood values. We introduce a modified TPT, termed 'ensemble TPT (e-TPT)', to enhance the temporal resolution of zero-inflated time series data that is crucial for clustering them efficiently. Furthermore, this study defines a modified similarity measure for zero-inflated time series data considering e-TPT and proposes an efficient iterative clustering algorithm suitable for the proposed measure. Finally, the effectiveness of the proposed method is demonstrated by simulation experiments and two real datasets: step count data and newly confirmed COVID-19 case data.
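
As a rough illustration of the basic thick-pen idea described above, the sketch below computes the lower and upper boundaries traced by a square pen of a given thickness. It assumes one common square-pen formulation (window minimum and maximum, shifted by half the thickness), which the abstract does not spell out. The function name `thick_pen_bounds` is hypothetical, and the code does not implement the ensemble variant (e-TPT) or the similarity measure proposed in the paper.

```python
def thick_pen_bounds(series, thickness):
    """Return (lower, upper) boundaries of a square-pen thick-pen transform.

    Assumption: at each time t the pen covers the indices within
    thickness // 2 of t and extends thickness / 2 above and below the data.
    """
    half = thickness // 2
    n = len(series)
    lower, upper = [], []
    for t in range(n):
        # Neighbourhood of t, truncated at the ends of the series.
        window = series[max(0, t - half):min(n, t + half + 1)]
        lower.append(min(window) - thickness / 2)
        upper.append(max(window) + thickness / 2)
    return lower, upper


# A zero-inflated toy series with a single burst of activity.
lower, upper = thick_pen_bounds([0, 0, 5, 0, 0], thickness=2)
```

Larger thickness values smear the burst over more neighbouring time points, which is what makes the transform a multi-scale summary.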

3.
IEEE Trans Pattern Anal Mach Intell ; 43(6): 2165-2171, 2021 Jun.
Article in English | MEDLINE | ID: mdl-32956037

ABSTRACT

This paper presents a new approach for dimension reduction of data observed on spherical surfaces. Several dimension reduction techniques have been developed in recent years for non-Euclidean data analysis. In pioneering work, Hauberg (2016) attempted to implement principal curves on Riemannian manifolds. However, this approach uses approximations to process data on Riemannian manifolds, resulting in distorted results. This study proposes a new approach to project data onto a continuous curve to construct principal curves on spherical surfaces. Our approach follows the same line as Hastie and Stuetzle (1989), who proposed principal curves for data in Euclidean space. We further investigate the stationarity of the proposed principal curves that satisfy the self-consistency on spherical surfaces. The results on the real data analysis and simulation examples show promising empirical characteristics of the proposed approach.

4.
Biometrics ; 77(1): 293-304, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32150282

ABSTRACT

This paper considers the clustering problem of physical step count data recorded on wearable devices. Clustering step data gives insight into an individual's activity status and further provides the groundwork for health-related policies. However, classical methods, such as K-means clustering and hierarchical clustering, are not suitable for step count data that are typically high-dimensional and zero-inflated. This paper presents a new clustering method for step data based on a novel combination of ensemble clustering and binning. We first construct multiple sets of binned data by changing the size and starting position of the bin, and then merge the clustering results from the binned data using a voting method. The advantage of binning, as a critical component, is that it substantially reduces the dimension of the original data while preserving the essential characteristics of the data. As a result, combining clustering results from multiple binned data can provide an improved clustering result that reflects both local and global structures of the data. Simulation studies and real data analysis were carried out to evaluate the empirical performance of the proposed method and demonstrate its general utility.


Subjects
Algorithms , Cluster Analysis , Computer Simulation
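
The binning step described in the abstract above (varying the size and the starting position of the bin) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names `bin_counts` and `binned_versions` are hypothetical, summing within bins is an assumed aggregation, and the clustering and voting steps are not shown.

```python
def bin_counts(series, bin_size, start=0):
    """Sum consecutive values into bins of `bin_size`, beginning at index `start`.

    Assumption: values before `start` form one shorter leading bin,
    so that no observations are dropped.
    """
    binned = []
    if start > 0:
        binned.append(sum(series[:start]))
    for i in range(start, len(series), bin_size):
        binned.append(sum(series[i:i + bin_size]))
    return binned


def binned_versions(series, bin_sizes):
    """Build one binned series for each (bin size, starting position) pair."""
    return {
        (size, start): bin_counts(series, size, start)
        for size in bin_sizes
        for start in range(size)
    }


# Toy step counts for eight time slots; most slots are zero.
steps = [0, 0, 3, 5, 0, 0, 2, 0]
versions = binned_versions(steps, bin_sizes=[2, 4])
```

Each binned version is much shorter than the original series, and each one could be clustered separately before the results are merged.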
5.
Stat Methods Med Res ; 29(11): 3205-3217, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32368950

ABSTRACT

This paper presents a new model-based generalized functional clustering method for discrete longitudinal data, such as multivariate binomial and Poisson distributed data. For this purpose, we propose a multivariate functional principal component analysis (MFPCA)-based clustering procedure for a latent multivariate Gaussian process instead of the original functional data directly. The main contribution of this study is two-fold: modeling discrete longitudinal data with the latent multivariate Gaussian process, and developing a clustering algorithm based on MFPCA coupled with the latent multivariate Gaussian process. Numerical experiments, including real data analysis and a simulation study, demonstrate the promising empirical properties of the proposed approach.


Subjects
Algorithms , Cluster Analysis , Computer Simulation , Multivariate Analysis , Normal Distribution
6.
J Appl Stat ; 47(11): 1957-1969, 2020.
Article in English | MEDLINE | ID: mdl-35707571

ABSTRACT

Dynamic principal component analysis (DPCA), also known as frequency domain principal component analysis, was developed by Brillinger [Time Series: Data Analysis and Theory, Vol. 36, SIAM, 1981] to decompose multivariate time-series data into a few principal component series. A primary advantage of DPCA is its capability of extracting essential components from the data by reflecting their serial dependence. It is also used to estimate the common component in a dynamic factor model, which is frequently used in econometrics. However, its beneficial property cannot be utilized when missing values are present, which should not be simply ignored when estimating the spectral density matrix in the DPCA procedure. Based on a novel combination of conventional DPCA and the self-consistency concept, we propose a DPCA method for data with missing values. We demonstrate the advantage of the proposed method over some existing imputation methods through Monte Carlo experiments and real data analysis.

7.
Springerplus ; 5(1): 2016, 2016.
Article in English | MEDLINE | ID: mdl-27942428

ABSTRACT

This paper considers an improvement of empirical mode decomposition (EMD) in the presence of missing data. EMD has been widely used to decompose nonlinear and nonstationary signals into some components according to intrinsic frequency called intrinsic mode functions. However, the conventional EMD may not be efficient when missing values are present. This paper proposes a modified EMD procedure based on a novel combination of empirical mode decomposition and self-consistency concept. The self-consistency provides an effective imputation method of missing data, and hence, the proposed EMD procedure produces stable decomposition results. Simulation studies and the image analysis demonstrate that the proposed method produces substantially effective results.

8.
Asian-Australas J Anim Sci ; 28(6): 771-81, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25925054

ABSTRACT

Thoroughbred, a relatively recent horse breed, is best known for its use in horse racing. Although myostatin (MSTN) variants have been reported to be highly associated with horse racing performance, the trait is more likely to be polygenic in nature. The purpose of this study was to identify genetic variants strongly associated with racing performance by using estimated breeding value (EBV) for race time as a phenotype. We conducted a two-stage genome-wide association study to search for genetic variants associated with the EBV. In the first stage of the genome-wide association study, a relatively large number of markers (~54,000 single-nucleotide polymorphisms, SNPs) were evaluated in a small number of samples (240 horses). In the second stage, a relatively small number of markers identified to have large effects (170 SNPs) were evaluated in a much larger number of samples (1,156 horses). We also validated the SNPs related to MSTN known to have large effects on racing performance and found significant associations in the stage two analysis, but not in stage one. We identified 28 significant SNPs related to 17 genes. Among these, six genes have a function related to myogenesis and five genes are involved in muscle maintenance. To our knowledge, these genes are newly reported for the genetic association with racing performance of Thoroughbreds. This finding complements recent horse genome-wide association studies of racing performance that identified other SNPs and genes as the most significant variants. These results will help to expand our knowledge of the polygenic nature of racing performance in Thoroughbreds.

9.
Asian-Australas J Anim Sci ; 27(12): 1678-83, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25358359

ABSTRACT

This study considers a problem of genomic selection (GS) for adjacent genetic markers of Yorkshire pigs which are typically correlated. The GS has been widely used to efficiently estimate target variables such as molecular breeding values using markers across the entire genome. Recently, GS has been applied to animals as well as plants, especially to pigs. For efficient selection of variables with specific traits in pig breeding, it is required that any such variable selection retains some properties: i) it produces a simple model by identifying insignificant variables; ii) it improves the accuracy of the prediction of future data; and iii) it is feasible to handle high-dimensional data in which the number of variables is larger than the number of observations. In this paper, we applied several variable selection methods including least absolute shrinkage and selection operator (LASSO), fused LASSO and elastic net to data with 47K single nucleotide polymorphisms and litter size for 519 observed sows. Based on experiments, we observed that the fused LASSO outperforms other approaches.
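
For reference, the three penalized least-squares estimators named in the abstract above are commonly written as follows. The notation is an assumption introduced here, not taken from the paper: $y$ is the response vector (for example litter size), $X$ the matrix of $p$ marker covariates, $\beta$ the coefficient vector, and $\lambda, \lambda_1, \lambda_2 \ge 0$ tuning parameters.

```latex
\hat{\beta}_{\text{LASSO}}
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert

\hat{\beta}_{\text{fused}}
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
    + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
    + \lambda_2 \sum_{j=2}^{p} \lvert \beta_j - \beta_{j-1} \rvert

\hat{\beta}_{\text{EN}}
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
    + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
    + \lambda_2 \sum_{j=1}^{p} \beta_j^2
```

The second penalty term of the fused LASSO shrinks differences between neighbouring coefficients, which is why it is a natural candidate when adjacent markers are correlated.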

10.
BMB Rep ; 46(6): 310-5, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23790974

ABSTRACT

The gene order on the X chromosome of eutherians is generally highly conserved, although an increase in the rate of rearrangement has been reported in the rodent lineage. Conservation of the X chromosome is thought to be caused by selection related to maintenance of dosage compensation. However, we herein reveal that the cattle (Btau4.0) lineage has experienced a strong increase in the rate of X-chromosome rearrangement, much stronger than that previously reported for rodents. We also show that this increase is not matched by a similar increase on the autosomes and cannot be explained by assembly errors. Furthermore, we compared the difference in two cattle genome assemblies: Btau4.0 and Btau6.0 (Bos taurus UMD3.1). The results showed a discrepancy between Btau4.0 and Btau6.0 cattle assembly version data, and we believe that Btau6.0 cattle assembly version data are not more reliable than Btau4.0.


Subjects
Biological Evolution , X Chromosome , Animals , Cattle , Chromosome Mapping/veterinary , Genetic Linkage , Genome , Humans
11.
IEEE Trans Image Process ; 13(6): 773-81, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15648868

ABSTRACT

In this paper, we focus on denoising images for which observations are equally spaced except around the boundaries, which are irregular. Such images are very common in many fields, for example in geophysics. The advantages of adding a low-order polynomial term when implementing a wavelet regression for such images are presented. Besides removing the classical restriction of having a dyadic number of observations, this strategy reduces the bias at the edges without significantly increasing the risk. In addition, this method is simple to implement, fast and efficient. Its utility is illustrated with simulation studies and a real example.


Subjects
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Numerical Analysis, Computer-Assisted , Pattern Recognition, Automated/methods , Information Storage and Retrieval/methods , Regression Analysis , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted
...