1.
Stat Sci ; 26(1): 130-149, 2011 Feb 01.
Article in English | MEDLINE | ID: mdl-24089585

ABSTRACT

This paper presents a unified treatment of Gaussian process models that extends to data from the exponential dispersion family and to survival data. Our specific interest is in the analysis of data sets with predictors that have an a priori unknown form of possibly nonlinear associations to the response. The modeling approach we describe incorporates Gaussian processes in a generalized linear model framework to obtain a class of nonparametric regression models where the covariance matrix depends on the predictors. We consider, in particular, continuous, categorical and count responses. We also look into models that account for survival outcomes. We explore alternative covariance formulations for the Gaussian process prior and demonstrate the flexibility of the construction. Next, we focus on the important problem of selecting variables from the set of possible predictors and describe a general framework that employs mixture priors. We compare alternative MCMC strategies for posterior inference and achieve a computationally efficient and practical approach. We demonstrate performances on simulated and benchmark data sets.
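
Illustrative note: the abstract above describes a covariance matrix that depends on the predictors. The following is a minimal sketch of one way such a matrix can be built, assuming a squared-exponential form with one length scale per predictor. The kernel choice, the function name `sq_exp_cov`, and the parameters `length_scales`, `variance` and `jitter` are assumptions made for illustration and are not taken from the paper.

```python
import math


def sq_exp_cov(X, length_scales, variance=1.0, jitter=1e-8):
    """Covariance matrix K with K[i][j] = variance * exp(-0.5 * d2(i, j)).

    d2(i, j) is the squared distance between rows i and j of X after each
    predictor is divided by its own length scale. A small jitter is added
    to the diagonal for numerical stability.
    """
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum(
                ((a - b) / scale) ** 2
                for a, b, scale in zip(X[i], X[j], length_scales)
            )
            K[i][j] = variance * math.exp(-0.5 * d2)
        K[i][i] += jitter
    return K


# Two observations, two predictors (made-up values).
X = [[0.0, 0.0], [1.0, 5.0]]
# A very large length scale makes the second predictor nearly irrelevant.
K = sq_exp_cov(X, length_scales=[1.0, 1e6])
```

In this sketch a predictor with a very large length scale contributes almost nothing to the covariance, which is one intuition for how a covariance parameterization can be tied to variable selection.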

2.
Cancer Inform ; 3: 19-28, 2007 Feb 05.
Article in English | MEDLINE | ID: mdl-19455232

ABSTRACT

In recent years, there has been an increased interest in using protein mass spectroscopy to identify molecular markers that discriminate diseased from healthy individuals. Existing methods are tailored towards classifying observations into nominal categories. Sometimes, however, the outcome of interest may be measured on an ordered scale. Ignoring this natural ordering results in some loss of information. In this paper, we propose a Bayesian model for the analysis of mass spectrometry data with ordered outcome. The method provides a unified approach for identifying relevant markers and predicting class membership. This is accomplished by building a stochastic search variable selection method within an ordinal outcome model. We apply the methodology to mass spectrometry data on ovarian cancer cases and healthy individuals. We also utilize wavelet-based techniques to remove noise from the mass spectra prior to analysis. We identify protein markers associated with being healthy, having low grade ovarian cancer, or being a high grade case. For comparison, we repeated the analysis using conventional classification procedures and found improved predictive accuracy with our method.
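
Illustrative note: an ordinal outcome model of the kind mentioned in the abstract above is often written with a latent continuous score and ordered cutpoints. The sketch below assumes a probit link and computes the category probabilities for a single observation. The function name `ordinal_probit_probs`, the linear predictor `eta` and the cutpoint values are hypothetical, and the paper's exact formulation may differ.

```python
from statistics import NormalDist


def ordinal_probit_probs(eta, cutpoints):
    """Return P(y = k) for k = 0..len(cutpoints) under an ordinal probit model.

    eta is the linear predictor; cutpoints must be strictly increasing.
    P(y <= k) = Phi(cutpoints[k] - eta), so each category probability is
    the difference between consecutive cumulative probabilities.
    """
    phi = NormalDist().cdf
    cumulative = [0.0] + [phi(c - eta) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]


# Three ordered classes (for example healthy, low grade, high grade)
# need two cutpoints; the values here are made up.
probs = ordinal_probit_probs(eta=0.0, cutpoints=[-1.0, 1.0])
```

Because the categories share one latent scale, the ordering of the classes is used directly rather than discarded, which is the loss of information the abstract refers to when outcomes are treated as nominal.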

3.
Bioinformatics ; 22(18): 2262-8, 2006 Sep 15.
Article in English | MEDLINE | ID: mdl-16845144

ABSTRACT

MOTIVATION: A common task in microarray data analysis consists of identifying genes associated with a phenotype. When the outcomes of interest are censored time-to-event data, standard approaches assess the effect of genes by fitting univariate survival models. In this paper, we propose a Bayesian variable selection approach, which allows the identification of relevant markers by jointly assessing sets of genes. We consider accelerated failure time (AFT) models with log-normal and log-t distributional assumptions. A data augmentation approach is used to impute the failure times of censored observations and mixture priors are used for the regression coefficients to identify promising subsets of variables. The proposed method provides a unified procedure for the selection of relevant genes and the prediction of survivor functions. RESULTS: We demonstrate the performance of the method on simulated examples and on several microarray datasets. For the simulation study, we consider scenarios with a large number of noisy variables and different degrees of correlation between the relevant and non-relevant (noisy) variables. We are able to identify the correct covariates and obtain good prediction of the survivor functions. For the microarray applications, some of our selected genes are known to be related to the diseases under study and a few are in agreement with findings from other researchers. AVAILABILITY: The Matlab code for implementing the Bayesian variable selection method may be obtained from the corresponding author. CONTACT: mvannucci@stat.tamu.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Artifacts , Gene Expression Profiling/methods , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Artificial Intelligence , Bayes Theorem , Computer Simulation , Models, Statistical , Pattern Recognition, Automated/methods
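
Illustrative note: the data augmentation step described in the abstract above imputes failure times for censored observations. Under a log-normal AFT model with right censoring, one standard way to do this is to draw the log failure time from a normal distribution truncated below at the log censoring time. The sketch below uses inverse-CDF sampling; the function name `impute_censored_log_time` and all numeric values are hypothetical, and this is not the authors' Matlab code.

```python
import math
import random
from statistics import NormalDist


def impute_censored_log_time(mu, sigma, censor_time, rng):
    """Draw log T from N(mu, sigma^2) restricted to log T >= log(censor_time).

    mu would be the linear predictor for the subject and sigma the error
    scale at the current MCMC iteration (both assumed known here).
    """
    dist = NormalDist(mu, sigma)
    lower = math.log(censor_time)
    p_lower = dist.cdf(lower)
    # Uniform draw on the part of the CDF above the censoring point.
    u = p_lower + (1.0 - p_lower) * rng.random()
    # Keep u strictly inside (0, 1) so inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    # Guard against rounding pushing the draw below the bound.
    return max(dist.inv_cdf(u), lower)


rng = random.Random(0)
draws = [
    impute_censored_log_time(mu=1.0, sigma=0.5, censor_time=5.0, rng=rng)
    for _ in range(1000)
]
```

Every imputed value is at least the log of the censoring time, so the completed data stay consistent with what was observed. Inverse-CDF sampling loses accuracy when the censoring point lies far in the upper tail, so a production sampler would likely use a dedicated truncated-normal routine.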
4.
Biometrics ; 60(3): 812-9, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15339306

ABSTRACT

Here we focus on discrimination problems where the number of predictors substantially exceeds the sample size and we propose a Bayesian variable selection approach to multinomial probit models. Our method makes use of mixture priors and Markov chain Monte Carlo techniques to select sets of variables that differ among the classes. We apply our methodology to a problem in functional genomics using gene expression profiling data. The aim of the analysis is to identify molecular signatures that characterize two different stages of rheumatoid arthritis.


Subject(s)
Models, Statistical , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Arthritis, Rheumatoid/classification , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/physiopathology , Bayes Theorem , Biometry , Humans , Markov Chains , Models, Biological , Monte Carlo Method
5.
Math Biosci ; 189(1): 61-73, 2004 May.
Article in English | MEDLINE | ID: mdl-15051414

ABSTRACT

The response of solid tumors to antitumor treatment generally declines markedly with treatment time. Sometimes, a tumor regrows (rebounds) before the end of the treatment period. Studies of the patterns of tumor response to treatment are important, because they may provide useful information for clinical decision-making. We have investigated patterns of tumor response in mouse xenograft tumors by using data from a study conducted at St. Jude Children's Research Hospital. We applied a biexponential non-linear mixed-effects model to an analysis of changes in tumor volume over a given period of treatment. The model gives a good fit to the data, even for small sample sizes. We addressed the relation between the baseline tumor volumes and the decay rates of the first and second stages of the tumor's response to treatment, and we applied sensitivity analysis to determine the effect of using different imputed values for missing data. We also proposed a novel approach to a comparison of the antitumor effects of three different treatments, and we used the data from a St. Jude study to demonstrate the potential of this comparison approach in cancer clinical decision-making.


Subject(s)
Antineoplastic Agents/therapeutic use , Camptothecin/analogs & derivatives , Dacarbazine/analogs & derivatives , Neoplasms/drug therapy , Nonlinear Dynamics , Xenograft Model Antitumor Assays , Algorithms , Animals , Antineoplastic Agents/administration & dosage , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Camptothecin/administration & dosage , Camptothecin/therapeutic use , Computer Simulation , Dacarbazine/administration & dosage , Dacarbazine/therapeutic use , Data Interpretation, Statistical , Humans , Irinotecan , Mice , Rhabdomyosarcoma/drug therapy , Temozolomide , Treatment Outcome
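
Illustrative note: a biexponential curve of the kind named in the abstract above is the sum of two exponential terms. The sketch below evaluates such a curve for one animal and, under the added assumption that the second term has a negative rate (regrowth), locates the time of minimum volume in closed form. The parameterization, the function names `tumor_volume` and `nadir_time`, and the parameter values are assumptions for illustration; the random effects of the mixed-effects model are not shown.

```python
import math


def tumor_volume(t, a1, r1, a2, r2):
    """Biexponential volume: a1 * exp(-r1 * t) + a2 * exp(-r2 * t)."""
    return a1 * math.exp(-r1 * t) + a2 * math.exp(-r2 * t)


def nadir_time(a1, r1, a2, r2):
    """Time of minimum volume when the first term decays and the second regrows.

    Setting the derivative to zero gives
    a1 * r1 * exp(-r1 * t) = a2 * (-r2) * exp(-r2 * t),
    so t = log(a1 * r1 / (a2 * -r2)) / (r1 - r2).
    Returns None when there is no minimum at a positive time.
    """
    if not (a1 > 0 and a2 > 0 and r1 > 0 and r2 < 0):
        return None
    ratio = (a1 * r1) / (a2 * (-r2))
    if ratio <= 1.0:
        return None  # volume is already increasing at t = 0
    return math.log(ratio) / (r1 - r2)


# Made-up parameters: fast initial decay, slow regrowth.
params = dict(a1=1.0, r1=0.5, a2=0.05, r2=-0.1)
t_min = nadir_time(**params)
v_min = tumor_volume(t_min, **params)
```

With both amplitudes positive the curve is convex, so the stationary point found here is the unique minimum, which corresponds to the start of the rebound described in the abstract.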
6.
Bioinformatics ; 19(1): 90-7, 2003 Jan.
Article in English | MEDLINE | ID: mdl-12499298

ABSTRACT

UNLABELLED: Selection of significant genes via expression patterns is an important problem in microarray experiments. Owing to small sample size and the large number of variables (genes), the selection process can be unstable. This paper proposes a hierarchical Bayesian model for gene (variable) selection. We employ latent variables to specialize the model to a regression setting and use a Bayesian mixture prior to perform the variable selection. We control the size of the model by assigning a prior distribution over the dimension (number of significant genes) of the model. The posterior distributions of the parameters are not in explicit form and we need to use a combination of truncated sampling and Markov Chain Monte Carlo (MCMC) based computation techniques to simulate the parameters from the posteriors. The Bayesian model is flexible enough to identify significant genes as well as to perform future predictions. The method is applied to cancer classification via cDNA microarrays where the genes BRCA1 and BRCA2 are associated with a hereditary disposition to breast cancer, and the method is used to identify a set of significant genes. The method is also applied successfully to the leukemia data. SUPPLEMENTARY INFORMATION: http://stat.tamu.edu/people/faculty/bmallick.html.


Subject(s)
Algorithms , Bayes Theorem , Gene Expression Profiling/methods , Genes/genetics , Breast Neoplasms/classification , Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Genes, BRCA1 , Genes, BRCA2 , Genetic Markers/genetics , Genetic Predisposition to Disease/classification , Genetic Predisposition to Disease/genetics , Humans , Leukemia, Myeloid/classification , Leukemia, Myeloid/genetics , Models, Genetic , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods , Precursor Cell Lymphoblastic Leukemia-Lymphoma/classification , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Sample Size
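
Illustrative note: the mixture prior used for variable selection in the abstract above can be illustrated in a much simpler setting than the paper's hierarchical model. The sketch below assumes a single observed effect `y` with Gaussian noise, and a prior on the true effect that is a point mass at zero with probability `1 - prior_incl` and a zero-mean normal "slab" with probability `prior_incl`. It returns the posterior probability that the effect is nonzero. The setting, the function names and the numbers are all assumptions for illustration.

```python
import math


def log_normal_pdf(x, var):
    """Log density of N(0, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + x * x / var)


def inclusion_probability(y, noise_var, slab_var, prior_incl):
    """Posterior probability that the effect behind y is nonzero.

    Marginally, y ~ N(0, noise_var + slab_var) if the effect is included
    and y ~ N(0, noise_var) if it is excluded; Bayes' rule combines these
    two marginal likelihoods with the prior inclusion probability.
    """
    log_odds = (
        math.log(prior_incl / (1.0 - prior_incl))
        + log_normal_pdf(y, noise_var + slab_var)
        - log_normal_pdf(y, noise_var)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up values: a small and a large observed effect.
p_small = inclusion_probability(y=0.0, noise_var=1.0, slab_var=10.0, prior_incl=0.1)
p_large = inclusion_probability(y=5.0, noise_var=1.0, slab_var=10.0, prior_incl=0.1)
```

An effect near zero is pushed below its prior inclusion probability, while a large effect is included almost surely. In a full model with many correlated genes these probabilities are not available in closed form, which is why the abstract turns to MCMC.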
7.
Comp Funct Genomics ; 4(2): 171-81, 2003.
Article in English | MEDLINE | ID: mdl-18629129

ABSTRACT

The use of large-scale microarray expression profiling to identify predictors of disease class has become of major interest. Beyond their impact in the clinical setting (i.e. improving diagnosis and treatment), these markers are also likely to provide clues on the molecular mechanisms underlying the diseases. In this paper we describe a new method for the identification of multiple gene predictors of disease class. The method is applied to the classification of two forms of arthritis that have a similar clinical endpoint but different underlying molecular mechanisms: rheumatoid arthritis (RA) and osteoarthritis (OA). We aim at both the classification of samples and the location of genes characterizing the different classes. We achieve both goals simultaneously by combining a binary probit model for classification with Bayesian variable selection methods to identify important genes. We find very small sets of genes that lead to good classification results. Some of the selected genes are clearly correlated with known aspects of the biology of arthritis and, in some cases, reflect already known differences between RA and OA.
