Results 1 - 9 of 9
1.
Stat Methods Med Res ; 30(10): 2313-2328, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34468235

ABSTRACT

Propensity score matching is widely used to determine the effects of treatments in observational studies. Competing risk survival data are common to medical research. However, there is a paucity of propensity score matching studies related to competing risk survival data with missing causes of failure. In this study, we provide guidelines for estimating the treatment effect on the cumulative incidence function when using propensity score matching on competing risk survival data with missing causes of failure. We examined the performances of different methods for imputing the data with missing causes. We then evaluated the gain from the missing cause imputation in an extensive simulation study and applied the proposed data imputation method to the data from a study on the risk of hepatocellular carcinoma in patients with chronic hepatitis B and chronic hepatitis C.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Carcinoma, Hepatocellular/etiology , Causality , Computer Simulation , Humans , Liver Neoplasms/etiology , Propensity Score
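The cumulative incidence function that the abstract above targets can be illustrated with a minimal sketch of the standard nonparametric (Aalen-Johansen type) estimator. It assumes every cause of failure is observed and coded as an integer (0 for censoring); it does not perform propensity score matching or the missing-cause imputation studied in the paper, and the function name is hypothetical.

```python
from typing import List


def cumulative_incidence(times: List[float], causes: List[int],
                         cause: int, horizon: float) -> float:
    """Estimate the cumulative incidence of `cause` by time `horizon`.

    causes[i] is 0 for a censored subject, otherwise the failure cause (1, 2, ...).
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    surv = 1.0  # probability of being free of any event just before the current time
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        tied = [c for (u, c) in data if u == t]  # all subjects leaving at time t
        d_all = sum(1 for c in tied if c != 0)   # failures from any cause at t
        d_k = sum(1 for c in tied if c == cause)  # failures from the cause of interest
        cif += surv * d_k / n_at_risk
        surv *= 1.0 - d_all / n_at_risk
        n_at_risk -= len(tied)
        i += len(tied)
    return cif


times = [1.0, 2.0, 3.0, 4.0, 5.0]
causes = [1, 2, 0, 1, 0]
cif_cause1 = cumulative_incidence(times, causes, cause=1, horizon=5.0)
cif_cause2 = cumulative_incidence(times, causes, cause=2, horizon=5.0)
```

In a matched analysis, one would compute such curves separately in the treated and control groups of the matched sample and compare them; how missing causes are imputed beforehand is the subject of the paper and is not sketched here.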
2.
Stat Methods Med Res ; 26(2): 661-673, 2017 Apr.
Article in English | MEDLINE | ID: mdl-25305195

ABSTRACT

In many cohort studies, time to events such as disease recurrence is recorded in an interval-censored format. An important objective is to predict patient outcomes. Clinicians are interested in predictive covariates. Prediction rules based on the receiver operating characteristic curve alone are not related to the survival endpoint. We propose a model evaluation strategy to leverage the predictive accuracy based on negative predictive functions. Our proposed method makes very few assumptions and only requires a working model to obtain the regression coefficients. A nonparametric estimate of the predictive accuracy provides a simple and flexible approach for model evaluation to interval-censored survival outcomes. The implementation effort is minimal, therefore this method has an increased potential for immediate use in biomedical data analyses. Simulation studies and a breast cancer trial example further illustrate the practical advantages of this approach.


Subject(s)
Models, Statistical , Survival Analysis , Biostatistics/methods , Breast Neoplasms/drug therapy , Clinical Trials, Phase III as Topic/statistics & numerical data , Computer Simulation , Disease-Free Survival , Female , Humans , Prognosis , Proportional Hazards Models , Recurrence , Statistics, Nonparametric
3.
Stat Methods Med Res ; 25(4): 1718-35, 2016 Aug.
Article in English | MEDLINE | ID: mdl-23907782

ABSTRACT

Doubly censored data often arise in medical studies of disease progression involving two related events for which both an originating and a terminating event are interval-censored. Although regression modeling for such doubly censored data may be complicated, we propose a simple semiparametric regression modeling strategy based on jackknife pseudo-observations obtained using nonparametric estimators of the survival function. Inference is carried out via generalized estimating equations. Simulation studies show that the proposed method produces virtually unbiased covariate effect estimates, even for moderate sample sizes. A prostate cancer study example illustrates the practical advantages of the proposed approach.


Subject(s)
Clinical Trials, Phase II as Topic/methods , Models, Statistical , Prostatic Neoplasms/drug therapy , Disease Progression , Humans , Male , Proportional Hazards Models , Regression Analysis , Sample Size , Statistics, Nonparametric , Survival Analysis
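The jackknife pseudo-observations named in the abstract above follow the general recipe n * theta_hat - (n - 1) * theta_hat_without_i. The sketch below illustrates that recipe only; it assumes fully observed event times and uses the empirical survival fraction as a stand-in estimator, whereas the paper relies on nonparametric estimators for doubly censored data that are not reproduced here. All function names are hypothetical.

```python
from typing import Callable, List


def empirical_survival(times: List[float], t: float) -> float:
    """Fraction of subjects still event-free after time t (no censoring assumed)."""
    return sum(1 for x in times if x > t) / len(times)


def pseudo_observations(
    times: List[float],
    t: float,
    estimator: Callable[[List[float], float], float] = empirical_survival,
) -> List[float]:
    """Return one jackknife pseudo-observation of S(t) per subject."""
    n = len(times)
    full = estimator(times, t)
    pseudo = []
    for i in range(n):
        leave_one_out = times[:i] + times[i + 1:]  # sample without subject i
        pseudo.append(n * full - (n - 1) * estimator(leave_one_out, t))
    return pseudo


times = [2.0, 5.0, 7.0, 11.0]
po = pseudo_observations(times, t=6.0)
```

With no censoring, each pseudo-observation reduces to the indicator that the subject's event time exceeds t, which is a convenient sanity check. The pseudo-observations would then serve as responses in a regression fitted by generalized estimating equations, a step this sketch leaves out.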
4.
BMC Genomics ; 13 Suppl 8: S9, 2012.
Article in English | MEDLINE | ID: mdl-23281802

ABSTRACT

BACKGROUND: RNA sequencing (RNA-seq) has become a major tool for biomedical research. A key step in analyzing RNA-seq data is to infer the origin of short reads in the source genome, and for this purpose, many read alignment/mapping software programs have been developed. Usually, the majority of mappable reads can be mapped to one unambiguous genomic location, and these reads are called unique reads. However, a considerable proportion of mappable reads can be aligned to more than one genomic location with the same or similar fidelities, and they are called "multireads". Allocating these multireads is challenging but critical for interpreting RNA-seq data. We recently developed a Bayesian stochastic model that allocates multireads more accurately than alternative methods (Ji et al. Biometrics 2011). RESULTS: In order to serve a greater biological community, we have implemented this method in a stand-alone, efficient, and user-friendly software package, BM-Map. BM-Map takes SAM (Sequence Alignment/Map), the most popular read alignment format, as the standard input; then based on the Bayesian model, it calculates mapping probabilities of multireads for competing genomic loci; and BM-Map generates the output by adding mapping probabilities to the original SAM file so that users can easily perform downstream analyses. The program is available in three common operating systems, Linux, Mac and PC. Moreover, we have built a dedicated website, http://bioinformatics.mdanderson.org/main/BM-Map, which includes free downloads, detailed tutorials and illustration examples. CONCLUSIONS: We have developed a stand-alone, efficient, and user-friendly software package for accurately allocating multireads, which is an important addition to our previous methodology paper. We believe that this bioinformatics tool will greatly help RNA-seq and related applications reach their full potential in life science research.


Subject(s)
Sequence Analysis, RNA , Software , Algorithms , Bayes Theorem , Chromosome Mapping , Internet , Polymorphism, Single Nucleotide , User-Computer Interface
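The idea of writing mapping probabilities back into a SAM file, as described in the abstract above, might look roughly like the sketch below. The tag name `XP` and the helper function are hypothetical and are not taken from BM-Map, whose output format the abstract does not specify. The sketch relies only on the general SAM convention that optional fields have the form TAG:TYPE:VALUE and that tags starting with X, Y or Z are reserved for end users.

```python
def add_probability_tag(sam_line: str, probability: float, tag: str = "XP") -> str:
    """Append a float-valued optional field (TAG:f:VALUE) to one SAM alignment line."""
    if sam_line.startswith("@"):
        return sam_line  # header lines are passed through unchanged
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    record = sam_line.rstrip("\n")
    return f"{record}\t{tag}:f:{probability:.4f}"


# A made-up alignment record with the 11 mandatory SAM fields.
example = "read1\t0\tchr1\t100\t0\t4M\t*\t0\t0\tACGT\tIIII"
tagged = add_probability_tag(example, 0.75)
```

Keeping the original fields untouched and only appending a tag means downstream tools that ignore unknown optional fields can still read the file.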
5.
Biometrics ; 67(4): 1215-24, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21517792

ABSTRACT

Next-generation sequencing (NGS) technology generates millions of short reads, which provide valuable information for various aspects of cellular activities and biological functions. A key step in NGS applications (e.g., RNA-Seq) is to map short reads to correct genomic locations within the source genome. While most reads are mapped to a unique location, a significant proportion of reads align to multiple genomic locations with equal or similar numbers of mismatches; these are called multireads. The ambiguity in mapping the multireads may lead to bias in downstream analyses. Currently, most practitioners discard the multireads in their analysis, resulting in a loss of valuable information, especially for the genes with similar sequences. To refine the read mapping, we develop a Bayesian model that computes the posterior probability of mapping a multiread to each competing location. The probabilities are used for downstream analyses, such as the quantification of gene expression. We show through simulation studies and RNA-Seq analysis of real-life data that the Bayesian method yields better mapping than the current leading methods. We provide a C++ program for download that is being packaged into a user-friendly software package.


Subject(s)
Algorithms , Bayes Theorem , Data Interpretation, Statistical , RNA/genetics , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Base Sequence , Molecular Sequence Data
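The posterior probability of assigning a multiread to each competing location, mentioned in the abstract above, can be illustrated with a toy application of Bayes' theorem. This is not the paper's model: the likelihood below assumes independent per-base sequencing errors at a fixed rate, the prior is uniform unless supplied, and the function and parameter names are hypothetical.

```python
from typing import List, Optional


def mapping_posteriors(
    mismatches: List[int],
    read_length: int,
    error_rate: float = 0.01,
    priors: Optional[List[float]] = None,
) -> List[float]:
    """Posterior probability of each candidate location for one multiread.

    mismatches[j] is the number of mismatched bases when the read is
    aligned to candidate location j.
    """
    if priors is None:
        priors = [1.0] * len(mismatches)  # uniform prior over candidate locations
    weights = [
        prior * error_rate ** m * (1.0 - error_rate) ** (read_length - m)
        for prior, m in zip(priors, mismatches)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# One read of length 50 aligning to two loci with 0 and 1 mismatches.
posteriors = mapping_posteriors([0, 1], read_length=50)
```

Under these assumptions a single extra mismatch shifts almost all of the posterior mass to the better alignment, while two equally good alignments split it evenly. A prior informed by, for example, the unique-read coverage of each locus would pull the allocation further.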
6.
Biom J ; 52(2): 222-32, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20391535

ABSTRACT

When drawing large-scale simultaneous inference, such as in genomics and imaging problems, multiplicity adjustments should be made, since, otherwise, one would be faced with an inflated type I error. Numerous methods are available to estimate the proportion of true null hypotheses, π₀, among a large number of hypotheses tested. Many methods implicitly assume that π₀ is large, that is, close to 1. However, in practice, mid-range π₀ values are frequently encountered and many of the widely used methods tend to produce highly variable or biased estimates of π₀. As a remedy in such situations, we propose a hierarchical Bayesian model that produces an estimator of π₀ that exhibits considerably less bias and is more stable. Simulation studies seem indicative of good method performance even when low-to-moderate correlation exists among test statistics. Method performance is assessed in simulated settings and its practical usefulness is illustrated in an application to a type II diabetes study.


Subject(s)
Algorithms , Biometry/methods , Data Interpretation, Statistical , Models, Biological , Models, Statistical , Computer Simulation
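For context on what estimating the proportion of true null hypotheses involves, the sketch below shows one widely used frequentist estimator (Storey's lambda-threshold estimator), computed from p-values. It is given only as an example of the existing methods the abstract above alludes to; it is not the hierarchical Bayesian estimator proposed in the paper, and the function name and default threshold are choices made for this illustration.

```python
from typing import List


def storey_pi0(p_values: List[float], lam: float = 0.5) -> float:
    """Estimate the proportion of true nulls as #{p > lam} / (m * (1 - lam))."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    above = sum(1 for p in p_values if p > lam)
    # P-values from true nulls are uniform, so those above lam are mostly nulls.
    return min(1.0, above / (m * (1.0 - lam)))


p_values = [0.001, 0.002, 0.01, 0.02, 0.03, 0.04, 0.6, 0.9]
pi0_hat = storey_pi0(p_values)
```

The estimate depends on the choice of the threshold, which is one source of the variability the abstract describes for mid-range values of the true proportion.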
7.
Biometrics ; 64(4): 1223-30, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18355383

ABSTRACT

SUMMARY: Inverse dose-response estimation refers to the inference of an effective dose of some agent that gives a desired probability of response, say 0.5. We consider inverse dose response for two agents, an application that has not received much attention in the literature. Through the posterior profiling technique (Hsu, 1995, The Canadian Journal of Statistics 23, 399-410), we propose a Bayesian method in which we approximate the marginal posterior distribution of an effective dose using a profile posterior distribution, and obtain the maximum a posteriori (MAP) estimate for the effective dose. We then employ an adaptive direction sampling algorithm to obtain the highest posterior density (HPD) credible region for the effective dose. Using the MAP and HPD estimates, investigators will be able to simultaneously calibrate the levels of two agents in dose-response studies. We illustrate our proposed Bayesian method through a simulation study and two practical examples.


Subject(s)
Bayes Theorem , Dose-Response Relationship, Drug , Algorithms , Computer Simulation , Drug Dosage Calculations , Humans
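Inverse dose-response estimation for two agents can be made concrete with a small plug-in calculation. The sketch assumes a logistic dose-response model without interaction, logit P(response) = b0 + b1 * dose1 + b2 * dose2; the abstract above does not state the model form, so this model, the coefficient values and the function names are all assumptions for illustration. The Bayesian machinery of the paper (profile posterior, MAP estimate, HPD region) is not reproduced.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def effective_dose_agent2(b0: float, b1: float, b2: float,
                          dose1: float, target: float = 0.5) -> float:
    """Dose of agent 2 that gives response probability `target` at a fixed dose of agent 1."""
    if b2 == 0.0:
        raise ValueError("agent 2 has no effect under this model (b2 == 0)")
    return (logit(target) - b0 - b1 * dose1) / b2


# Hypothetical coefficients: at dose1 = 2, which dose2 yields a 50% response?
dose2 = effective_dose_agent2(b0=-4.0, b1=1.0, b2=2.0, dose1=2.0)
```

Because one target probability corresponds to a whole line of dose pairs under this model, the quantity of interest with two agents is a region of effective dose combinations rather than a single number.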
8.
PLoS Genet ; 2(11): e203, 2006 Nov 24.
Article in English | MEDLINE | ID: mdl-17166056

ABSTRACT

Nonsense-mediated mRNA decay (NMD) is a eukaryotic mechanism of RNA surveillance that selectively eliminates aberrant transcripts coding for potentially deleterious proteins. NMD also functions in the normal repertoire of gene expression. In Saccharomyces cerevisiae, hundreds of endogenous RNA Polymerase II transcripts achieve steady-state levels that depend on NMD. For some, the decay rate is directly influenced by NMD (direct targets). For others, abundance is NMD-sensitive but without any effect on the decay rate (indirect targets). To distinguish between direct and indirect targets, total RNA from wild-type (Nmd(+)) and mutant (Nmd(-)) strains was probed with high-density arrays across a 1-h time window following transcription inhibition. Statistical models were developed to describe the kinetics of RNA decay. 45% ± 5% of RNAs targeted by NMD were predicted to be direct targets with altered decay rates in Nmd(-) strains. Parallel experiments using conventional methods were conducted to empirically test predictions from the global experiment. The results show that the global assay reliably distinguished direct versus indirect targets. Different types of targets were investigated, including transcripts containing adjacent, disabled open reading frames, upstream open reading frames, and those prone to out-of-frame initiation of translation. Known targeting mechanisms fail to account for all of the direct targets of NMD, suggesting that additional targeting mechanisms remain to be elucidated. 30% of the protein-coding targets of NMD fell into two broadly defined functional themes: those affecting chromosome structure and behavior and those affecting cell surface dynamics. Overall, the results provide a preview for how expression profiles in multi-cellular eukaryotes might be impacted by NMD. Furthermore, the methods for analyzing decay rates on a global scale offer a blueprint for new ways to study mRNA decay pathways in any organism where cultured cell lines are available.


Subject(s)
Gene Expression Regulation, Fungal , RNA Interference/physiology , RNA Processing, Post-Transcriptional/physiology , RNA Stability/physiology , Saccharomyces cerevisiae/metabolism , Codon, Initiator/analysis , Computer Simulation , Gene Expression Profiling/methods , Gene Expression Regulation, Fungal/drug effects , Half-Life , Models, Biological , Models, Theoretical , Open Reading Frames/genetics , Organisms, Genetically Modified , Protein Biosynthesis , Pyrrolidinones/pharmacology , RNA Processing, Post-Transcriptional/drug effects , RNA Stability/drug effects , RNA, Messenger/classification , Reproducibility of Results , Saccharomyces cerevisiae/drug effects
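Estimating a decay rate from a time course after transcription inhibition, as described in the abstract above, can be sketched under the simplest kinetic assumption. The code assumes first-order (exponential) decay, A(t) = A0 * exp(-k * t), and fits k by ordinary least squares on log abundances; the abstract does not say which statistical models the authors used, so this is an illustrative stand-in, and the data and function names are hypothetical.

```python
import math
from typing import List


def decay_rate(times: List[float], abundances: List[float]) -> float:
    """Least-squares estimate of k in A(t) = A0 * exp(-k * t) from log abundances."""
    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return -sxy / sxx  # the slope of log abundance versus time is -k


def half_life(k: float) -> float:
    """Half-life implied by a first-order decay rate k."""
    return math.log(2.0) / k


# Noise-free synthetic time course over 60 minutes with k = 0.05 per minute.
times = [0.0, 10.0, 20.0, 40.0, 60.0]
abundances = [100.0 * math.exp(-0.05 * t) for t in times]
k_hat = decay_rate(times, abundances)
t_half = half_life(k_hat)
```

Comparing the fitted rate for the same transcript in wild-type and mutant strains is the kind of contrast that would separate direct targets (rate changes) from indirect targets (abundance changes while the rate does not).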
9.
Bioinformatics ; 21(7): 1055-61, 2005 Apr 01.
Article in English | MEDLINE | ID: mdl-15514000

ABSTRACT

MOTIVATION: The classification of samples using gene expression profiles is an important application in areas such as cancer research and environmental health studies. However, the classification is usually based on a small number of samples, and each sample is a long vector of thousands of gene expression levels. An important issue in parametric modeling for so many gene expression levels is the control of the number of nuisance parameters in the model. Large models often lead to intensive or even intractable computation, while small models may be inadequate for complex data. METHODOLOGY: We propose a two-step empirical Bayes classification method as a solution to this issue. At the first step, we use the model-based cluster algorithm with a non-traditional purpose of assigning gene expression levels to form abundance groups. At the second step, by assuming the same variance for all the genes in the same group, we substantially reduce the number of nuisance parameters in our statistical model. RESULTS: The proposed model is more parsimonious, which leads to efficient computation under an empirical Bayes estimation procedure. We consider two real examples and simulate data using our method. Desired low classification error rates are obtained even when a large number of genes are pre-selected for class prediction.


Subject(s)
Algorithms , Gene Expression Profiling/methods , Gene Expression Regulation/physiology , Models, Biological , Oligonucleotide Array Sequence Analysis/methods , Pattern Recognition, Automated/methods , Signal Transduction/physiology , Bayes Theorem , Biomarkers, Tumor/metabolism , Cluster Analysis , Humans , Leukemia/diagnosis , Leukemia/metabolism , Models, Statistical , Neoplasm Proteins/metabolism