1.
BMC Syst Biol ; 10(1): 72, 2016 Aug 11.
Article in English | MEDLINE | ID: mdl-27516087

ABSTRACT

BACKGROUND: Canonical correlation analysis (CCA) is a multivariate statistical method which describes the associations between two sets of variables. The objective is to find linear combinations of the variables in each data set having maximal correlation. In genomics, CCA has become increasingly important to estimate the associations between gene expression data and DNA copy number change data. The identification of such associations might help to increase our understanding of the development of diseases such as cancer. However, these data sets are typically high-dimensional, containing many variables relative to the number of objects. Moreover, the data sets might contain atypical observations since it is likely that objects react differently to treatments. We discuss a method for Robust Sparse CCA, thereby providing a solution to both issues. Sparse estimation produces canonical vectors with some of their elements estimated as exactly zero. As such, their interpretability is improved. Robust methods can cope with atypical observations in the data. RESULTS: We illustrate the good performance of the Robust Sparse CCA method by several simulation studies and three biometric examples. Robust Sparse CCA considerably outperforms its main alternatives in (1) correctly detecting the main associations between the data sets, (2) accurately estimating these associations, and (3) detecting outliers. CONCLUSIONS: Robust Sparse CCA delivers interpretable canonical vectors, while at the same time coping with outlying observations. The proposed method is able to describe the associations between high-dimensional data sets, which are nowadays commonplace in genomics. Furthermore, the Robust Sparse CCA method allows outliers to be characterized.
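
For reference, the CCA objective described in this abstract can be written as the following optimization problem. This is the standard textbook formulation rather than a formula quoted from the paper, and the symbols are assumptions introduced here: X (n x p) and Y (n x q) are the two data matrices, a and b are the canonical vectors, and the Sigma terms are the within-set and between-set covariance matrices.

```latex
(\hat{a}, \hat{b})
  = \arg\max_{a,\, b} \operatorname{corr}(Xa,\, Yb)
  = \arg\max_{a,\, b}
    \frac{a^\top \Sigma_{XY}\, b}
         {\sqrt{a^\top \Sigma_{XX}\, a}\; \sqrt{b^\top \Sigma_{YY}\, b}}
```

A sparse variant additionally forces some entries of a and b to be exactly zero, and a robust variant limits the influence of outlying rows of X and Y on the estimate.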


Subjects
Computational Biology/methods , Algorithms , Breast Neoplasms/genetics
2.
Biom J ; 57(5): 834-51, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26147637

ABSTRACT

Canonical correlation analysis (CCA) describes the associations between two sets of variables by maximizing the correlation between linear combinations of the variables in each dataset. However, in high-dimensional settings where the number of variables exceeds the sample size or when the variables are highly correlated, traditional CCA is no longer appropriate. This paper proposes a method for sparse CCA. Sparse estimation produces linear combinations of only a subset of variables from each dataset, thereby increasing the interpretability of the canonical variates. We consider the CCA problem from a predictive point of view and recast it into a regression framework. By combining an alternating regression approach with a lasso penalty, we induce sparsity in the canonical vectors. We compare its performance with that of other sparse CCA techniques in different simulation settings and illustrate its usefulness on a genomic dataset.
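
To illustrate how a lasso penalty can set coefficients exactly to zero inside a regression step, here is a minimal sketch of lasso regression by coordinate descent. It is not the authors' algorithm: it shows only a single penalized regression, whereas an alternating scheme would repeat such a step in turn for each dataset, using the current canonical variate of the other dataset as the response. The function names, the penalty value `lam`, and the toy data are all hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows (each a list of p floats); y is a list of n floats.
    Assumes no intercept and that the columns of X are already centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0       # correlation of column j with the partial residual
            col_norm = 0.0  # squared norm of column j
            for i in range(n):
                fit_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_others)
                col_norm += X[i][j] ** 2
            # Closed-form update for one coordinate of the lasso problem.
            beta[j] = soft_threshold(rho / n, lam) / (col_norm / n)
    return beta


if __name__ == "__main__":
    # Toy data (hypothetical): y depends strongly on column 0, weakly on column 1.
    X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
    y = [2.1, 1.9, -1.9, -2.1]
    print(lasso_coordinate_descent(X, y, lam=0.5))
```

On this toy design the weak second coefficient is shrunk to exactly zero while the strong first coefficient is kept (shrunk to about 1.5), which is the sparsity mechanism the abstract refers to.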


Subjects
Biometry/methods , Algorithms , Genomics , Regression Analysis
3.
Biometrics ; 68(1): 31-44, 2012 Mar.
Article in English | MEDLINE | ID: mdl-21668906

ABSTRACT

Generalized linear models are a widely used method to obtain parametric estimates for the mean function. They have been further extended to allow the relationship between the mean function and the covariates to be more flexible via generalized additive models. However, the fixed variance structure can in many cases be too restrictive. The extended quasilikelihood (EQL) framework allows for estimation of both the mean and the dispersion/variance as functions of covariates. As for other maximum likelihood methods though, EQL estimates are not resistant to outliers: we need methods to obtain robust estimates for both the mean and the dispersion function. In this article, we obtain functional estimates for the mean and the dispersion that are both robust and smooth. The performance of the proposed method is illustrated via a simulation study and some real data examples.


Subjects
Algorithms , Biological Models , Statistical Models , Computer Simulation
4.
Sensors (Basel) ; 9(6): 4211-29, 2009.
Article in English | MEDLINE | ID: mdl-22408522

ABSTRACT

A well-known problem for precise positioning in real environments is the presence of outliers in the measurement sample. Its importance is even greater in ultrasound-based systems, since this technology needs a direct line of sight between emitters and receivers. Standard techniques for outlier detection in range-based systems do not usually employ robust algorithms and fail when multiple outliers are present. The direct application of standard robust regression algorithms fails in static positioning (where only the current measurement sample is considered) in real ultrasound-based systems, mainly due to the limited number of measurements and the geometry effects. This paper presents a new robust algorithm, called RoPEUS, based on MM estimation, that follows a typical two-step strategy: 1) a high breakdown point algorithm to obtain a clean sample, and 2) a refinement algorithm to increase the accuracy of the solution. The main modifications proposed to the standard MM robust algorithm are a built-in check of partial solutions in the first step (rejecting bad geometries) and the off-line calculation of the scale of the measurements. The algorithm is tested with real samples obtained with the 3D-LOCUS ultrasound localization system in an ideal environment without obstacles. These measurements are corrupted with typical outlying patterns to numerically evaluate the algorithm's performance with respect to the standard parity space algorithm. The algorithm proves to be robust under single or multiple outliers, providing similar accuracy figures in all cases.
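
The two-step strategy named in this abstract (a high breakdown point start followed by a refinement) can be sketched on a much simpler problem than positioning. The example below is not RoPEUS and does not estimate a position: it estimates a single location value from repeated range readings, starting from the median with a MAD-based scale and refining with Tukey bisquare weights. All function names, the tuning constant 4.685, and the sample readings are assumptions made for illustration. Unlike the paper, which computes the measurement scale off-line, this sketch derives the scale from the sample itself.

```python
import statistics


def mad_scale(values, center):
    """Median absolute deviation, rescaled to be consistent with a normal sigma."""
    deviations = [abs(v - center) for v in values]
    return 1.4826 * statistics.median(deviations)


def tukey_weight(r, c=4.685):
    """Tukey bisquare weight: smooth down-weighting, exactly 0 beyond c."""
    u = r / c
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def robust_location(values, n_iter=50, tol=1e-9):
    """Two-step robust location estimate.

    Step 1: high breakdown point start (median) with a fixed MAD scale.
    Step 2: iteratively reweighted refinement using bisquare weights.
    """
    estimate = statistics.median(values)
    scale = mad_scale(values, estimate)
    if scale == 0.0:
        return estimate  # more than half the readings are identical
    for _ in range(n_iter):
        weights = [tukey_weight((v - estimate) / scale) for v in values]
        total = sum(weights)
        if total == 0.0:
            break  # every reading was rejected; keep the current estimate
        updated = sum(w * v for w, v in zip(weights, values)) / total
        converged = abs(updated - estimate) < tol
        estimate = updated
        if converged:
            break
    return estimate


if __name__ == "__main__":
    # Hypothetical range readings: five consistent values and two gross outliers.
    readings = [10.0, 10.1, 9.9, 10.2, 9.8, 25.0, 30.0]
    print("mean:  ", statistics.mean(readings))
    print("robust:", robust_location(readings))
```

In this toy sample the two outlying readings receive zero weight, so the refined estimate stays close to the consistent readings while the plain mean is pulled far away from them.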

5.
Br J Psychiatry ; 190: 293-8, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17401034

ABSTRACT

BACKGROUND: Low socio-economic status is associated with a higher prevalence of depression, but it is not yet known whether change in socio-economic status leads to a change in rates of depression. AIMS: To assess whether longitudinal change in socio-economic factors affects change of depression level. METHOD: In a prospective cohort study using the annual Belgian Household Panel Survey (1992-1999), depression was assessed using the Global Depression Scale. Socio-economic factors were assessed with regard to material standard of living, education, employment status and social relationships. RESULTS: A lowering in material standard of living between annual waves was associated with increases in depressive symptoms and caseness of major depression. Life circumstances also influenced depression. Ceasing to cohabit with a partner increased depressive symptoms and caseness, and improvement in circumstances reduced them; the negative effects were stronger than the positive ones. CONCLUSIONS: The study showed a clear relationship between worsening socio-economic circumstances and depression.


Subjects
Depressive Disorder/psychology , Socioeconomic Factors , Adolescent , Adult , Belgium/epidemiology , Depressive Disorder/epidemiology , Depressive Disorder/etiology , Female , Health Status , Humans , Longitudinal Studies , Male , Middle Aged , Prevalence , Psychiatric Status Rating Scales/statistics & numerical data , Risk Factors
6.
Biometrics ; 62(4): 972-9, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17156270

ABSTRACT

In biostatistical practice, it is common to use information criteria as a guide for model selection. We propose new versions of the focused information criterion (FIC) for variable selection in logistic regression. The FIC gives, depending on the quantity to be estimated, possibly different sets of selected variables. The standard version of the FIC measures the mean squared error of the estimator of the quantity of interest in the selected model. In this article, we propose more general versions of the FIC, allowing other risk measures such as the one based on L(p) error. When prediction of an event is important, as is often the case in medical applications, we construct an FIC using the error rate as a natural risk measure. The advantages of using an information criterion which depends on both the quantity of interest and the selected risk measure are illustrated by means of a simulation study and application to a study on diabetic retinopathy.


Subjects
Logistic Models , Biometry , Clinical Trials as Topic/statistics & numerical data , Statistical Data Interpretation , Factual Databases , Diabetic Retinopathy/epidemiology , Female , Humans , Male , Risk , Wisconsin/epidemiology