Results 1 - 4 of 4
1.
J Multivar Anal ; 171: 382-396, 2019 May.
Article in English | MEDLINE | ID: mdl-31588153

ABSTRACT

By optimizing index functions against different outcomes, we propose a multivariate single-index model (SIM) for the development of medical indices that work with multiple outcomes simultaneously. Fitting a multivariate SIM is not fundamentally different from fitting a univariate SIM, as the former can be written as a sum of multiple univariate SIMs with appropriate indicator functions. What has not been carefully studied is the theoretical behavior of the parameter estimators. Because asymptotic results are lacking, no formal inference procedure has been made available for multivariate SIMs. In this paper, we examine the asymptotic properties of the multivariate SIM parameter estimators. We show that, under mild regularity conditions, estimators for the multivariate SIM parameters are indeed √n-consistent and asymptotically normal. We conduct a simulation study to investigate the finite-sample performance of the corresponding estimation and inference procedures. To illustrate its use in practice, we construct an index measure of urine electrolyte markers for assessing the risk of hypertension in individual subjects.
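
The remark that a multivariate SIM is a sum of univariate SIMs with indicator functions can be written out roughly as follows. The notation is an assumption made for illustration and does not come from the abstract: J outcomes, covariate vector x_i, index vectors β_k, unknown link functions g_k, and an outcome label j_i attached to each stacked observation. A single shared index would be the special case β_1 = ... = β_J.

```latex
% Assumed notation, for illustration only.
y_i \;=\; \sum_{k=1}^{J} \mathbb{1}(j_i = k)\, g_k\!\left(x_i^{\top}\beta_k\right) + \varepsilon_i ,
\qquad i = 1, \dots, N .
```

For any one observation exactly one indicator equals 1, so the sum reduces to the univariate SIM for that observation's outcome.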

2.
Stat Med ; 28(20): 2580-604, 2009 Sep 10.
Article in English | MEDLINE | ID: mdl-19548299

ABSTRACT

In this article, we present a general procedure to analyze exchangeable binary data that may also be viewed as realizations of binomial mixtures. Our approach unifies existing models and is practical and computationally easy. Starting from completely monotonic functions, we introduce a rich family of parsimonious parametric binomial mixtures, including the incomplete Beta-, Gamma-, Normal-, and Poisson-binomial, which generalize the Beta-binomial. We show that the family is closed under convex linear combinations, products, and composites. We also give the moments and the Markov property of the family. With such distributions, we can perform statistical inference on correlated binary data and, in particular, overdispersed data. We propose a regression procedure that generalizes logistic regression. We provide a forward model selection procedure. We run a small simulation to validate the inclusion of the binomial distribution. Finally, we apply the proposed procedure to analyze the 2,4,5-trichlorophenoxyacetic acid and E2 data and compare the results with existing procedures.


Subjects
Drug-Related Side Effects and Adverse Reactions/embryology; Models, Statistical; 2,4,5-Trichlorophenoxyacetic Acid/adverse effects; Abnormalities, Drug-Induced; Algorithms; Animals; Binomial Distribution; Computer Simulation; Fetal Death; Likelihood Functions; Markov Chains; Mice; Regression Analysis
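
As a small illustration of why binomial mixtures suit overdispersed binary data, the sketch below evaluates the Beta-binomial, the family member that the abstract says is being generalized, and compares its variance with that of a plain binomial with the same mean. It does not implement the article's new family or its regression procedure; the function names and the parameter values are hypothetical choices for the example.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) when K | p ~ Binomial(n, p) and p ~ Beta(alpha, beta)."""
    log_ratio = log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
    return math.comb(n, k) * math.exp(log_ratio)


def binomial_pmf(k, n, p):
    """P(K = k) for a plain Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def mean_and_variance(pmf_values):
    """Mean and variance of a distribution on 0..n given as a list of probabilities."""
    mean = sum(k * p for k, p in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf_values))
    return mean, var


# Hypothetical example: litters of size 10, success probability varying by litter.
n, alpha, beta = 10, 2.0, 3.0
mixture = [beta_binomial_pmf(k, n, alpha, beta) for k in range(n + 1)]
plain = [binomial_pmf(k, n, alpha / (alpha + beta)) for k in range(n + 1)]

mix_mean, mix_var = mean_and_variance(mixture)
bin_mean, bin_var = mean_and_variance(plain)
print(f"beta-binomial: mean={mix_mean:.3f} var={mix_var:.3f}")
print(f"binomial:      mean={bin_mean:.3f} var={bin_var:.3f}")
```

Both distributions have the same mean, but the mixture has the larger variance, which is the overdispersion that correlated binary responses typically show.
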
3.
IEEE Trans Pattern Anal Mach Intell ; 31(2): 288-305, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19110494

ABSTRACT

Statistical depth functions provide, starting from the "deepest" point, a "center-outward ordering" of multidimensional data. In this sense, depth functions can measure the "extremeness" or "outlyingness" of a data point with respect to a given data set. Hence, they can detect outliers, that is, observations that appear extreme relative to the rest of the observations. Of the various statistical depths, the spatial depth is especially appealing because of its computational efficiency and mathematical tractability. In this article, we propose a novel statistical depth, the kernelized spatial depth (KSD), which generalizes the spatial depth via positive definite kernels. By choosing a proper kernel, the KSD can capture the local structure of a data set where the spatial depth fails. We demonstrate this with the half-moon data and the ring-shaped data. Based on the KSD, we propose a novel outlier detection algorithm, by which an observation with a depth value less than a threshold is declared an outlier. The proposed algorithm is simple in structure: the threshold is the only parameter for a given kernel. It applies to a one-class learning setting, in which "normal" observations are given as the training data, as well as to a missing-label scenario, where the training set consists of a mixture of normal observations and outliers with unknown labels. We give upper bounds on the false alarm probability of a depth-based detector. These upper bounds can be used to determine the threshold. We perform extensive experiments on synthetic data and data sets from real applications. The proposed outlier detector is compared with existing methods. The KSD outlier detector demonstrates competitive performance.


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Models, Theoretical; Pattern Recognition, Automated/methods; Computer Simulation
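
The idea of a depth-thresholded detector can be sketched in a few lines. The code below uses the usual definition of spatial depth (one minus the norm of the average unit vector pointing from the data to the query point) and kernelizes it by replacing Euclidean inner products with kernel evaluations. Several details are assumptions rather than the article's specification: the normalization by the number of training points, the Gaussian kernel and its bandwidth, and the threshold value are all illustrative choices.

```python
import math


def spatial_depth(x, data):
    """Spatial depth of x: 1 - norm of the mean unit vector from data points to x."""
    total = [0.0] * len(x)
    for y in data:
        diff = [a - b for a, b in zip(x, y)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # a point coinciding with x contributes the zero vector
            total = [t + d / norm for t, d in zip(total, diff)]
    return 1.0 - math.sqrt(sum(t * t for t in total)) / len(data)


def linear_kernel(a, b):
    """Ordinary inner product; with it the KSD reduces to the spatial depth."""
    return sum(u * v for u, v in zip(a, b))


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel with bandwidth sigma (an illustrative choice)."""
    sq = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernelized_spatial_depth(x, data, kernel):
    """Spatial depth computed in the feature space induced by `kernel`."""
    kxx = kernel(x, x)
    kxy = [kernel(x, y) for y in data]
    # Feature-space distance between x and each data point.
    dist = [math.sqrt(max(kxx + kernel(y, y) - 2.0 * k, 0.0))
            for y, k in zip(data, kxy)]
    total = 0.0
    for i, y in enumerate(data):
        if dist[i] == 0.0:
            continue
        for j, z in enumerate(data):
            if dist[j] == 0.0:
                continue
            # Inner product of (phi(x) - phi(y)) and (phi(x) - phi(z)).
            inner = kxx + kernel(y, z) - kxy[i] - kxy[j]
            total += inner / (dist[i] * dist[j])
    return 1.0 - math.sqrt(max(total, 0.0)) / len(data)


def is_outlier(x, data, kernel, threshold):
    """Declare x an outlier when its kernelized depth falls below the threshold."""
    return kernelized_spatial_depth(x, data, kernel) < threshold


# Ring-shaped data: the center is the "deepest" point for the spatial depth,
# although no observation lies anywhere near it.
ring = [(math.cos(2 * math.pi * i / 12), math.sin(2 * math.pi * i / 12))
        for i in range(12)]
center = (0.0, 0.0)
print("spatial depth of center:", spatial_depth(center, ring))
print("KSD of center (sigma=0.2):",
      kernelized_spatial_depth(center, ring, lambda a, b: gaussian_kernel(a, b, 0.2)))
```

With a narrow Gaussian kernel the center of the ring receives a much lower depth than it does under the plain spatial depth, which is the kind of local structure the abstract refers to.
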
4.
BMC Bioinformatics ; 8 Suppl 7: S8, 2007 Nov 01.
Article in English | MEDLINE | ID: mdl-18047731

ABSTRACT

BACKGROUND: Mean-based clustering algorithms such as bisecting k-means generally lack robustness. Although the componentwise median is a more robust alternative, it can be a poor center representative for high-dimensional data. We need a new algorithm that is robust and works well on high-dimensional data sets such as gene expression data. RESULTS: Here we propose a new robust divisive clustering algorithm, the bisecting k-spatialMedian, based on the statistical spatial depth. A new subcluster selection rule, Relative Average Depth, is also introduced. We demonstrate that the proposed clustering algorithm outperforms the componentwise-median-based bisecting k-median algorithm for high dimension and low sample size (HDLSS) data by applying both algorithms to two real HDLSS gene expression data sets. When further applied to noisy real data sets, the proposed algorithm compares favorably in terms of robustness with the componentwise-median-based bisecting k-median algorithm. CONCLUSION: Statistical data depths provide an alternative way to find the "center" of multivariate data sets and are useful and robust for clustering.


Subjects
Algorithms; Data Interpretation, Statistical; Gene Expression Profiling/methods; Models, Biological; Models, Statistical; Oligonucleotide Array Sequence Analysis/methods; Computer Simulation
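
To make the contrast between a mean-based center and a depth-based center concrete, the sketch below computes the spatial median, the point that minimizes the sum of Euclidean distances to the data and has the greatest spatial depth. It uses the Weiszfeld iteration, which is a standard method chosen here for illustration; the abstract does not say how the authors compute the spatial median, and the bisecting procedure and the Relative Average Depth rule are not reproduced. The data set is a made-up example.

```python
import math


def componentwise_mean(points):
    """Ordinary mean, computed coordinate by coordinate."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]


def spatial_median(points, iters=200, tol=1e-9):
    """Approximate the spatial median with Weiszfeld's iteration (illustrative)."""
    m = componentwise_mean(points)  # starting value
    for _ in range(iters):
        weighted_sum = [0.0] * len(m)
        weight_total = 0.0
        for p in points:
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, m)))
            if d < 1e-12:
                continue  # simplification: skip a point that coincides with m
            w = 1.0 / d
            weight_total += w
            weighted_sum = [s + w * a for s, a in zip(weighted_sum, p)]
        if weight_total == 0.0:
            break
        new_m = [s / weight_total for s in weighted_sum]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_m, m)))
        m = new_m
        if shift < tol:
            break
    return m


# Four clustered points plus one gross outlier (hypothetical data).
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
print("mean:          ", componentwise_mean(cluster))
print("spatial median:", spatial_median(cluster))
```

The mean is dragged far toward the outlier, while the spatial median stays inside the cluster, which is the robustness property that motivates a depth-based center.
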