1.
Am J Epidemiol ; 2024 May 16.
Article in English | MEDLINE | ID: mdl-38751324

ABSTRACT

Our purpose was to investigate the associations of oxaliplatin-induced peripheral neuropathy (OIPN) and the sociodemographic and clinical characteristics of older colorectal cancer patients with falls. The study population consisted of older adults diagnosed with colorectal cancer obtained from the Surveillance, Epidemiology, and End Results database combined with Medicare claims. OIPN was defined using a specific (OIPN 1) and a broader (OIPN 2) definition, based on diagnosis codes. Extensions of the Cox regression model to accommodate repeated events were used to obtain overall hazard ratios (HR) with 95% confidence intervals and the cumulative hazard of fall. The unadjusted risk of fall for colorectal cancer survivors with vs. without OIPN 1 at 36 months of follow-up was 19.6% vs. 14.3%, respectively. The association of OIPN with time to fall was moderate (OIPN 1, HR = 1.37, 95% CI: 1.04, 1.79) to small (OIPN 2, HR = 1.24, 95% CI: 1.01, 1.53). Memantine, opioids, cannabinoids, prior history of fall, female sex, advanced age and disease stage, chronic liver disease, diabetes, and chronic obstructive pulmonary disease all increased the hazard rate of fall. Incorporating fall prevention in cancer care is essential to minimize morbidity and mortality of this serious event in older colorectal cancer survivors.

2.
J Nutr ; 153(10): 3110-3121, 2023 10.
Article in English | MEDLINE | ID: mdl-37604384

ABSTRACT

BACKGROUND: As the expansion of Supplemental Nutrition Assistance Program (SNAP) benefits and pandemic emergency assistance programs ended in late 2021, little is known about subsequent trends in food insufficiency (FI) among households with children. OBJECTIVES: This research examined the association between SNAP participation and FI among households with children in the United States, particularly non-Hispanic Black (Black) and Hispanic households. METHODS: This cross-sectional analysis used Household Pulse Survey data collected from December 2021 to May 2022. Spatial analysis was conducted to visualize FI and SNAP participation rates across 50 states. With state SNAP policy rules as exogenous instruments and sociodemographic factors as control variables, 2-stage probit models were utilized to assess the SNAP and FI association among all (n = 135,074), Black (n = 13,940), and Hispanic households with children (n = 17,869). RESULTS: Approximately 13.9% [95% confidence interval (CI): 13.85%, 13.99%] of households experienced FI, and 20.4% (CI: 20.35%, 20.51%) received SNAP benefits. Among Black and Hispanic households, higher rates were observed, with 23.3% (CI: 23.12%, 23.4%) and 20.8% (CI: 20.61%, 20.95%) experiencing FI and 36.3% (CI: 36.1%, 36.5%) and 26.9% (CI: 26.61%, 27.13%) receiving SNAP benefits. These rates varied across states, ranging from 8% (Utah) to 21.1% (Mississippi) for FI and from 8.8% (Utah) to 32.7% (New Mexico) for SNAP participation. SNAP participants demonstrated a 12% lower likelihood of FI than nonparticipants (CI: -0.18, -0.05, P < 0.001). Among Black households, SNAP participants had a 29% lower likelihood of FI than nonparticipants (CI: -0.54, -0.03, P < 0.001). However, SNAP participation was not significant among Hispanic households (P = 0.99), nor did it narrow the FI gap between Hispanic and non-Hispanic households (P = 0.22). 
CONCLUSIONS: SNAP participation was associated with lower levels of FI among households with children, particularly for Black households. However, there was no significant association between SNAP participation and FI among Hispanic households with children.


Subject(s)
COVID-19 , Food Assistance , Humans , United States/epidemiology , Child , Cross-Sectional Studies , Poverty , COVID-19/epidemiology , Mississippi , Food Supply
3.
Support Care Cancer ; 31(7): 386, 2023 Jun 09.
Article in English | MEDLINE | ID: mdl-37294347

ABSTRACT

PURPOSE: The purpose of this retrospective cohort study was to evaluate whether several potentially preventive therapies reduced the rate of oxaliplatin-induced peripheral neuropathy (OIPN) in colorectal cancer patients and to assess the relationship of sociodemographic/clinical factors with OIPN diagnosis. METHODS: Data were obtained from the Surveillance, Epidemiology, and End Results database combined with Medicare claims. Eligible patients were diagnosed with colorectal cancer between 2007 and 2015, ≥ 66 years of age, and treated with oxaliplatin. Two definitions were used to denote diagnosis of OIPN based on diagnosis codes: OIPN 1 (specific definition, drug-induced polyneuropathy) and OIPN 2 (broader definition, additional codes for peripheral neuropathy). Cox regression was used to obtain hazard ratios (HR) with 95% confidence intervals (CI) for the relative rate of OIPN within 2 years of oxaliplatin initiation. RESULTS: There were 4792 subjects available for analysis. At 2 years, the unadjusted cumulative incidence of OIPN 1 was 13.1% and 27.1% for OIPN 2. For both outcomes, no therapies reduced the rate of OIPN diagnosis. The anticonvulsants gabapentin and oxcarbazepine/carbamazepine were associated with an increased rate of OIPN (both definitions) as were increasing cycles of oxaliplatin. Compared to younger patients, those 75-84 years of age experienced a 15% decreased rate of OIPN. For OIPN 2, prior peripheral neuropathy and moderate/severe liver disease were also associated with an increased hazard rate. For OIPN 1, state buy-in health insurance coverage was associated with a decreased hazard rate. CONCLUSION: Additional studies are needed to identify preventive therapeutics for OIPN in cancer patients treated with oxaliplatin.


Subject(s)
Antineoplastic Agents , Colorectal Neoplasms , Peripheral Nervous System Diseases , United States , Humans , Aged , Oxaliplatin/adverse effects , Antineoplastic Agents/adverse effects , Retrospective Studies , Organoplatinum Compounds/adverse effects , Medicare , Peripheral Nervous System Diseases/chemically induced , Peripheral Nervous System Diseases/epidemiology , Peripheral Nervous System Diseases/prevention & control , Colorectal Neoplasms/drug therapy
4.
BMC Bioinformatics ; 23(1): 333, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-35962315

ABSTRACT

BACKGROUND: Influenza A viruses (IAV) exhibit vast genetic mutability and have great zoonotic potential to infect avian and mammalian hosts and are known to be responsible for a number of pandemics. A key computational issue in influenza prevention and control is the identification of molecular signatures with cross-species transmission potential. We propose an adjusted entropy-based host-specific signature identification method that uses a similarity coefficient to incorporate the amino acid substitution information and improve the identification performance. Mutations in the polymerase genes (e.g., PB2) are known to play a major role in avian influenza virus adaptation to mammalian hosts. We thus focus on the analysis of PB2 protein sequences and identify host specific PB2 amino acid signatures. RESULTS: Validation with a set of H5N1 PB2 sequences from 1996 to 2006 results in adjusted entropy having a 40% false negative discovery rate compared to a 60% false negative rate using unadjusted entropy. Simulations across different levels of sequence divergence show a false negative rate of no higher than 10% while unadjusted entropy ranged from 9 to 100%. In addition, under all levels of divergence adjusted entropy never had a false positive rate higher than 9%. Adjusted entropy also identifies important mutations in H1N1pdm PB2 previously identified in the literature that explain changes in divergence between 2008 and 2009 which unadjusted entropy could not identify. CONCLUSIONS: Based on these results, adjusted entropy provides a reliable and widely applicable host signature identification approach useful for IAV monitoring and vaccine development.
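
As a rough illustration of the unadjusted, entropy-based screen that the abstract uses as its baseline, the sketch below computes Shannon entropy per alignment position and flags positions that are conserved within each host group but differ in consensus residue between groups. It is not the authors' adjusted method: the similarity coefficient is not specified in the abstract, and the function names, the `max_within` threshold, and the toy sequences are all hypothetical.

```python
import math
from collections import Counter


def column_entropy(residues):
    """Shannon entropy (in bits) of the residues observed at one alignment position."""
    counts = Counter(residues)
    total = len(residues)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def candidate_signature_positions(group_a, group_b, max_within=0.5):
    """Return 0-based positions that are conserved within each group
    (entropy <= max_within, a hypothetical threshold) but whose most
    common residue differs between the two groups."""
    positions = []
    for i in range(len(group_a[0])):
        col_a = [seq[i] for seq in group_a]
        col_b = [seq[i] for seq in group_b]
        top_a = Counter(col_a).most_common(1)[0][0]
        top_b = Counter(col_b).most_common(1)[0][0]
        conserved = (column_entropy(col_a) <= max_within
                     and column_entropy(col_b) <= max_within)
        if conserved and top_a != top_b:
            positions.append(i)
    return positions


# Toy, made-up aligned fragments (not real PB2 sequences).
avian_like = ["MERIE", "MERIE", "MERIE"]
human_like = ["MKRIE", "MKRIE", "MKRIE"]
flagged = candidate_signature_positions(avian_like, human_like)
```

A plain entropy screen like this treats every substitution as equally informative, which is the limitation the adjusted method is described as addressing.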


Subject(s)
Influenza A Virus, H5N1 Subtype , Influenza A virus , Influenza, Human , Amino Acid Substitution , Amino Acids/genetics , Animals , Humans , Influenza A Virus, H5N1 Subtype/genetics , Influenza A Virus, H5N1 Subtype/metabolism , Influenza A virus/genetics , Influenza A virus/metabolism , Influenza, Human/genetics , Mammals/genetics , Viral Proteins/genetics , Viral Proteins/metabolism
5.
Am J Epidemiol ; 190(2): 239-250, 2021 02 01.
Article in English | MEDLINE | ID: mdl-32902633

ABSTRACT

We investigated characteristics of patients with colon cancer that predicted nonreceipt of posttreatment surveillance testing and the subsequent associations between surveillance status and survival outcomes. This was a retrospective cohort study of the Surveillance, Epidemiology, and End Results database combined with Medicare claims. Patients diagnosed between 2002 and 2009 with disease stages II and III and who were between 66 and 84 years of age were eligible. A minimum of 3 years' follow-up was required, and patients were categorized as having received any surveillance testing (any testing) versus none (no testing). Poisson regression was used to obtain risk ratios with 95% confidence intervals for the relative likelihood of No Testing. Cox models were used to obtain subdistribution hazard ratios with 95% confidence intervals for 5- and 10-year cancer-specific and noncancer deaths. There were 16,009 colon cancer cases analyzed. Patient characteristics that predicted No Testing included older age, Black race, stage III disease, and chemotherapy. Patients in the No Testing group had an increased rate of 10-year cancer death that was greater for patients with stage III disease (subdistribution hazard ratio = 1.79, 95% confidence interval: 1.48, 2.17) than those with stage II disease (subdistribution hazard ratio = 1.41, 95% confidence interval: 1.19, 1.66). Greater efforts are needed to ensure all patients receive the highest quality medical care after diagnosis of colon cancer.


Subject(s)
Colonic Neoplasms/pathology , Colonic Neoplasms/therapy , Age Factors , Aged , Aged, 80 and over , Chemotherapy, Adjuvant , Colonic Neoplasms/mortality , Comoros , Female , Humans , Male , Medicare/statistics & numerical data , Middle Aged , Neoplasm Staging , Odds Ratio , Prognosis , Proportional Hazards Models , Quality of Health Care , Racial Groups , Retrospective Studies , SEER Program/statistics & numerical data , Socioeconomic Factors , United States
6.
Am J Gastroenterol ; 115(6): 924-933, 2020 06.
Article in English | MEDLINE | ID: mdl-32142485

ABSTRACT

OBJECTIVES: Guideline-issuing groups differ regarding the recommendation that patients with stage I colon cancer receive surveillance colonoscopy after cancer-directed surgery. This observational comparative effectiveness study was conducted to evaluate the association between surveillance colonoscopy and colon cancer-specific mortality in early stage patients. METHODS: This was a retrospective cohort study of the Surveillance, Epidemiology, and End Results database combined with Medicare claims. Surveillance colonoscopy was assessed as a time-varying exposure up to 5 years after cancer-directed surgery with the following groups: no colonoscopy, one colonoscopy, and ≥ 2 colonoscopies. Inverse probability of treatment weighting was used to balance covariates. The time-dependent Cox regression model was used to obtain inverse probability of treatment weighting-adjusted hazard ratios (HRs), with 95% confidence intervals (CIs) for 5- and 10-year colon cancer, other cancer, and noncancer causes of death. RESULTS: There were 8,783 colon cancer cases available for analysis. Overall, compared with patients who received one colonoscopy, the no colonoscopy group experienced an increased rate of 10-year colon cancer-specific mortality (HR = 1.63; 95% CI 1.31-2.04) and noncancer death (HR = 1.36; 95% CI 1.25-1.49). Receipt of ≥ 2 colonoscopies was associated with a decreased rate of 10-year colon cancer-specific death (HR = 0.60; 95% CI 0.45-0.79), other cancer death (HR = 0.68; 95% CI 0.53-0.88), and noncancer death (HR = 0.69; 95% CI 0.62-0.76). Five-year cause-specific HRs were similar to 10-year estimates. DISCUSSION: These results support efforts to ensure that stage I patients undergo surveillance colonoscopy after cancer-directed surgery to facilitate early detection of new and recurrent neoplastic lesions.


Subject(s)
Carcinoma/surgery , Colonic Neoplasms/surgery , Neoplasm Recurrence, Local/diagnosis , Age Factors , Aged , Aged, 80 and over , Carcinoma/mortality , Carcinoma/pathology , Cause of Death , Colonic Neoplasms/mortality , Colonic Neoplasms/pathology , Comparative Effectiveness Research , Disease Management , Female , Humans , Information Storage and Retrieval , Male , Medicare , Neoplasm Grading , Neoplasm Staging , Proportional Hazards Models , SEER Program , United States
7.
BMC Cancer ; 19(1): 418, 2019 May 03.
Article in English | MEDLINE | ID: mdl-31053096

ABSTRACT

BACKGROUND: The best strategy for surveillance testing in stage II and III colon cancer patients following curative treatment is unknown. Previous randomized controlled trials have suffered from design limitations and yielded conflicting evidence. This observational comparative effectiveness research study was conducted to provide new evidence on the relationship between post-treatment surveillance testing and survival by overcoming the limitations of previous clinical trials. METHODS: This was a retrospective cohort study of the Surveillance, Epidemiology, and End Results database combined with Medicare claims (SEER-Medicare). Stage II and III colon cancer patients diagnosed from 2002 to 2009 and between 66 and 84 years of age were eligible. Adherence to surveillance testing guidelines (including carcinoembryonic antigen, computed tomography, and colonoscopy) was assessed for each year of follow-up and overall for up to three years post-treatment. Patients were categorized as More Adherent and Less Adherent according to testing guidelines. Patients who received no surveillance testing were excluded. The primary outcome was 5-year cancer-specific survival; 5-year overall survival was the secondary outcome. Inverse probability of treatment weighting (IPTW) using generalized boosted models was employed to balance covariates between the two surveillance groups. IPTW-adjusted survival curves comparing the two groups were estimated by the Kaplan-Meier method. Weighted Cox regression was used to obtain hazard ratios (HRs) with 95% confidence intervals (CIs) for the relative risk of death for the Less Adherent group versus the More Adherent group. RESULTS: There were 17,860 stage II and III colon cancer cases available for analysis. Compared to More Adherent patients, Less Adherent patients experienced slightly better 5-year cancer-specific survival (HR = 0.83, 95% CI 0.76-0.90) and worse 5-year noncancer-specific survival (HR = 1.61, 95% CI 1.43-1.82) for years 2 to 5 of follow-up. There was no difference between the groups in overall survival (HR = 1.04, 95% CI 0.98-1.10). CONCLUSIONS: More surveillance testing did not improve 5-year cancer-specific survival compared to less testing and there was no difference between the groups in overall survival. The results of this study support a risk-stratified, shared decision-making surveillance strategy to optimize clinical and patient-centered outcomes for colon cancer patients in the survivorship phase of care.
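
The weighting step named in the METHODS can be sketched as follows. This is a minimal illustration of inverse probability of treatment weighting only: the propensity scores are taken as given (in the study they come from generalized boosted models, which are not reproduced here), and the function names and toy values are hypothetical.

```python
def iptw_weights(in_group, propensity):
    """Inverse probability of treatment weights.

    in_group[i] is True if subject i belongs to the exposure group of interest;
    propensity[i] is that subject's estimated probability of belonging to it.
    Group members get 1/p, everyone else gets 1/(1 - p)."""
    return [1.0 / p if g else 1.0 / (1.0 - p)
            for g, p in zip(in_group, propensity)]


def weighted_mean(values, weights):
    """Weighted average, used here to compare covariate means after weighting."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy, made-up data: four subjects, two in each group.
in_group = [True, True, False, False]
propensity = [0.8, 0.5, 0.5, 0.2]   # assumed to come from some fitted model
age = [70.0, 80.0, 80.0, 70.0]

weights = iptw_weights(in_group, propensity)
mean_age_group = weighted_mean(
    [a for a, g in zip(age, in_group) if g],
    [w for w, g in zip(weights, in_group) if g],
)
mean_age_other = weighted_mean(
    [a for a, g in zip(age, in_group) if not g],
    [w for w, g in zip(weights, in_group) if not g],
)
```

Subjects who were unlikely to be in the group they ended up in receive larger weights, which is how the weighted groups are pulled toward a common covariate distribution before the Cox model is fitted.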


Subject(s)
Colonic Neoplasms/pathology , Colonic Neoplasms/therapy , Patient Compliance/statistics & numerical data , Population Surveillance/methods , Aged , Aged, 80 and over , Comparative Effectiveness Research , Female , Humans , Male , Neoplasm Staging , Retrospective Studies , SEER Program , Survival Analysis
8.
BMJ Open ; 8(4): e022393, 2018 04 28.
Article in English | MEDLINE | ID: mdl-29705770

ABSTRACT

INTRODUCTION: Although the colorectal cancer (CRC) mortality rate has significantly improved over the past several decades, many patients will have a recurrence following curative treatment. Despite this high risk of recurrence, adherence to CRC surveillance testing guidelines is poor, which increases cancer-related morbidity and, potentially, mortality. Several randomised controlled trials (RCTs) with varying surveillance strategies have yielded conflicting evidence regarding the survival benefit associated with surveillance testing. However, due to differences in study protocols and limitations of sample size and length of follow-up, the RCT may not be the best study design to evaluate this relationship. An observational comparative effectiveness research study can overcome the sample size/follow-up limitations of RCT designs while assessing real-world variability in receipt of surveillance testing to provide much needed evidence on this important clinical issue. The gap in knowledge that this study will address concerns whether adherence to National Comprehensive Cancer Network CRC surveillance guidelines improves survival. METHODS AND ANALYSIS: Patients with colon and rectal cancer aged 66-84 years, who have been diagnosed between 2002 and 2008 and have been included in the Surveillance, Epidemiology, and End Results-Medicare database, are eligible for this retrospective cohort study. To minimise bias, patients had to survive at least 12 months following the completion of treatment. Adherence to surveillance testing up to 5 years post-treatment will be assessed in each year of follow-up and overall. Binomial regression will be used to assess the association between patients' characteristics and adherence. Survival analysis will be conducted to assess the association between adherence and 5-year survival. ETHICS AND DISSEMINATION: This study was approved by the National Cancer Institute and the Institutional Review Board of the University of Central Florida. The results of this study will be disseminated by publishing in the peer-reviewed scientific literature, presentation at national/international scientific conferences and posting through social media.


Subject(s)
Colorectal Neoplasms , Aged , Aged, 80 and over , Colorectal Neoplasms/mortality , Colorectal Neoplasms/therapy , Florida , Humans , Medicare , Neoplasm Recurrence, Local , Retrospective Studies , SEER Program , Survival Analysis , United States
9.
Gene ; 618: 8-13, 2017 Jun 30.
Article in English | MEDLINE | ID: mdl-28322997

ABSTRACT

The coding pattern of protein can greatly affect the prediction accuracy of protein secondary structure. In this paper, a novel hybrid coding method based on the physicochemical properties of amino acids and tendency factors is proposed for the prediction of protein secondary structure. The principal component analysis (PCA) is first applied to the physicochemical properties of amino acids to construct a 3-bit-code, and then the 3 tendency factors of amino acids are calculated to generate another 3-bit-code. Two 3-bit-codes are fused to form a novel hybrid 6-bit-code. Furthermore, we make a geometry-based similarity comparison of the protein primary structure between the reference set and the test set before the secondary structure prediction. We finally use the support vector machine (SVM) to predict those amino acids which are not detected by the primary structure similarity comparison. Experimental results show that our method achieves a satisfactory improvement in accuracy in the prediction of protein secondary structure.


Subject(s)
Protein Structure, Secondary , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acids/chemistry , Support Vector Machine
10.
BMC Bioinformatics ; 17: 287, 2016 Jul 21.
Article in English | MEDLINE | ID: mdl-27439701

ABSTRACT

BACKGROUND: Clustering is a common technique used by molecular biologists to group homologous sequences and study evolution. There remain issues such as how to cluster molecular sequences accurately and in particular how to evaluate the certainty of clustering results. RESULTS: We presented a model-based clustering method to analyze molecular sequences, described a subset bootstrap scheme to evaluate the certainty of the clusters, and showed an intuitive way using 3D visualization to examine clusters. We applied the above approach to analyze influenza viral hemagglutinin (HA) sequences. Nine clusters were estimated for highly pathogenic H5N1 avian influenza, which agree with previous findings. The certainty that a given sequence can be correctly assigned to a cluster was 1.0 in every case, whereas the certainty for a given cluster was also very high (0.92-1.0), with an overall clustering certainty of 0.95. For influenza A H7 viruses, ten HA clusters were estimated and the vast majority of sequences could be assigned to a cluster with a certainty of more than 0.99. The certainties for clusters, however, varied from 0.40 to 0.98; such certainty variation is likely attributable to the heterogeneity of sequence data in different clusters. In both cases, the certainty values estimated using the subset bootstrap method are all higher than those calculated based upon the standard bootstrap method, suggesting our bootstrap scheme is applicable for the estimation of clustering certainty. CONCLUSIONS: We formulated a clustering analysis approach with the estimation of certainties and 3D visualization of sequence data. We analysed 2 sets of influenza A HA sequences and the results indicate our approach was applicable for clustering analysis of influenza viral sequences.


Subject(s)
Influenza A Virus, H5N1 Subtype/classification , Models, Theoretical , Animals , Base Sequence , Birds , Cluster Analysis , Hemagglutinins, Viral/chemistry , Influenza A Virus, H5N1 Subtype/metabolism , Influenza in Birds/virology , Phylogeny
11.
Stat Med ; 33(11): 1853-66, 2014 May 20.
Article in English | MEDLINE | ID: mdl-24420973

ABSTRACT

Health indices provide information to the general public on the health condition of the community. They can also be used to inform the government's policy making, to evaluate the effect of a current policy or healthcare program, or for program planning and priority setting. It is a common practice that the health indices across different geographic units are ranked and the ranks are reported as fixed values. We argue that the ranks should be viewed as random and hence should be accompanied by an indication of precision (i.e., the confidence intervals). A technical difficulty in doing so is how to account for the dependence among the ranks in the construction of confidence intervals. In this paper, we propose a novel Monte Carlo method for constructing the individual and simultaneous confidence intervals of ranks for age-adjusted rates. The proposed method uses as input age-specific counts (of cases of disease or deaths) and their associated populations. We have further extended it to the case in which only the age-adjusted rates and confidence intervals are available. Finally, we demonstrate the proposed method to analyze US age-adjusted cancer incidence rates and mortality rates for cancer and other diseases by states and counties within a state using a website that will be publicly available. The results show that for rare or relatively rare disease (especially at the county level), ranks are essentially meaningless because of their large variability, while for more common disease in larger geographic units, ranks can be effectively utilized.
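
The idea of treating ranks as random can be sketched with a small simulation. The block below covers only the simpler case mentioned in the abstract, where just the rates and their uncertainty are available, and it assumes each rate is approximately normal with a known standard error. That distributional assumption, the function name, and the percentile construction are illustrative choices rather than the authors' published algorithm, and the intervals are individual (pointwise), not simultaneous.

```python
import random


def rank_intervals(rates, std_errs, n_sim=5000, level=0.95, seed=0):
    """Monte Carlo percentile intervals for the rank of each unit.

    Each simulation draws a rate for every unit from Normal(rate, std_err),
    ranks the draws jointly (rank 1 = lowest rate), and records the ranks.
    Ranking jointly is what carries the dependence among ranks."""
    rng = random.Random(seed)
    k = len(rates)
    simulated = [[] for _ in range(k)]
    for _ in range(n_sim):
        draw = [rng.gauss(r, s) for r, s in zip(rates, std_errs)]
        order = sorted(range(k), key=lambda i: draw[i])
        for rank, unit in enumerate(order, start=1):
            simulated[unit].append(rank)
    lo_idx = int((1 - level) / 2 * n_sim)
    hi_idx = int((1 + level) / 2 * n_sim) - 1
    intervals = []
    for ranks in simulated:
        ranks.sort()
        intervals.append((ranks[lo_idx], ranks[hi_idx]))
    return intervals


# Made-up rates per 100,000: precise estimates give tight rank intervals,
# noisy estimates of similar rates give wide ones.
precise = rank_intervals([10.0, 20.0, 30.0], [0.1, 0.1, 0.1], n_sim=500)
noisy = rank_intervals([20.0, 20.0, 20.0], [5.0, 5.0, 5.0], n_sim=500)
```

With small standard errors the ranks barely move across simulations; with large standard errors relative to the gaps between rates, each unit's interval spans most of the possible ranks, which matches the abstract's point about rare diseases in small geographic units.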


Subject(s)
Bayes Theorem , Confidence Intervals , Data Interpretation, Statistical , Monte Carlo Method , Neoplasms/epidemiology , Age Factors , Algorithms , Computer Simulation , Humans , Incidence , Neoplasms/mortality , United States
12.
Genet Epidemiol ; 37(8): 814-9, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23959976

ABSTRACT

After genetic regions have been identified in genomewide association studies (GWAS), investigators often follow up with more targeted investigations of specific regions. These investigations typically are based on single nucleotide polymorphisms (SNPs) with dense coverage of a region. Methods are thus needed to test the hypothesis of any association in given genetic regions. Several approaches for combining P-values obtained from testing individual SNP hypothesis tests are available. We recently proposed a sequential procedure for testing the global null hypothesis of no association in a region. When this global null hypothesis is rejected, this method provides a list of significant hypotheses and has weak control of the family-wise error rate. In this paper, we devise a permutation-based version of the test that accounts for correlations of tests based on SNPs in the same genetic region. Based on simulated data, the method has correct control of the type I error rate and higher or comparable power to other tests.
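
Why permutation accounts for correlation among SNP tests can be seen in a generic max-statistic permutation test, sketched below. This is not the authors' sequential procedure, whose details are not given in the abstract. The per-SNP statistic (absolute difference in mean allele count between cases and controls), the function names, and the toy data are assumptions made for illustration.

```python
import random


def max_abs_mean_diff(genotypes, is_case):
    """Largest |mean allele count in cases - mean in controls| over all SNPs.

    genotypes holds one list per SNP, each with one allele count per subject."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    best = 0.0
    for snp in genotypes:
        case_sum = sum(g for g, c in zip(snp, is_case) if c)
        ctrl_sum = sum(snp) - case_sum
        best = max(best, abs(case_sum / n_case - ctrl_sum / n_ctrl))
    return best


def region_permutation_p(genotypes, is_case, n_perm=999, seed=0):
    """Permutation p-value for the global null of no association in the region.

    Only the case/control labels are shuffled, so each subject's genotypes
    stay together and the correlation between SNPs is preserved."""
    rng = random.Random(seed)
    observed = max_abs_mean_diff(genotypes, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if max_abs_mean_diff(genotypes, labels) >= observed - 1e-12:
            hits += 1
    return (1 + hits) / (1 + n_perm)


# Toy data: 10 cases then 10 controls; SNP 0 is strongly associated, SNP 1 is not.
status = [True] * 10 + [False] * 10
region = [[2] * 10 + [0] * 10, [1, 0] * 10]
p_region = region_permutation_p(region, status)
p_null = region_permutation_p([[1, 0] * 10], status)
```

Because the reference distribution is built from the same correlated SNPs, no separate correction for the number of effectively independent tests is needed.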


Subject(s)
Genome-Wide Association Study , Genomics , Algorithms , Humans , Linkage Disequilibrium , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide/genetics , Research Design
13.
Genet Epidemiol ; 36(1): 22-35, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22147673

ABSTRACT

Over the past several years, genome-wide association studies (GWAS) have succeeded in identifying hundreds of genetic markers associated with common diseases. However, most of these markers confer relatively small increments of risk and explain only a small proportion of familial clustering. To identify obstacles to future progress in genetic epidemiology research and provide recommendations to NIH for overcoming these barriers, the National Cancer Institute sponsored a workshop entitled "Next Generation Analytic Tools for Large-Scale Genetic Epidemiology Studies of Complex Diseases" on September 15-16, 2010. The goal of the workshop was to facilitate discussions on (1) statistical strategies and methods to efficiently identify genetic and environmental factors contributing to the risk of complex disease; and (2) how to develop, apply, and evaluate these strategies for the design, analysis, and interpretation of large-scale complex disease association studies in order to guide NIH in setting the future agenda in this area of research. The workshop was organized as a series of short presentations covering scientific (gene-gene and gene-environment interaction, complex phenotypes, and rare variants and next generation sequencing) and methodological (simulation modeling and computational resources and data management) topic areas. Specific needs to advance the field were identified during each session and are summarized.


Subject(s)
Gene-Environment Interaction , Genome-Wide Association Study , Molecular Epidemiology/methods , Data Mining/methods , Genetic Variation , Humans , National Institutes of Health (U.S.) , Neoplasms/genetics , Phenotype , United States
14.
Methods Mol Biol ; 674: 161-77, 2010.
Article in English | MEDLINE | ID: mdl-20827591

ABSTRACT

Localizing the binding sites of regulatory proteins is becoming increasingly feasible and accurate. This is due to dramatic progress not only in chromatin immunoprecipitation combined with next-generation sequencing (ChIP-seq) but also in advanced statistical analyses. A fundamental issue, however, is the alarming number of false positive predictions. This problem can be remedied by improved peak calling methods that use twin peaks (one on each strand of the DNA), kernel density estimators, and false discovery rate estimates based on control libraries. Predictions are filtered by de novo motif discovery in the peak environments. These methods have been implemented in, among others, Valouev et al.'s Quantitative Enrichment of Sequence Tags (QuEST) software tool. We demonstrate the prediction of binding sites of the human growth-associated binding protein (GABPalpha) based on ChIP-seq observations.


Subject(s)
Chromatin Immunoprecipitation , Sequence Analysis, DNA , Transcription Factors/metabolism , Binding Sites , False Positive Reactions , GA-Binding Protein Transcription Factor/metabolism , Humans , Internet , Jurkat Cells , Probability , Regulatory Sequences, Nucleic Acid/genetics , Reproducibility of Results , Software
15.
J Comput Biol ; 17(2): 177-87, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20078228

ABSTRACT

In microarray data analysis, false discovery rate (FDR) is now widely accepted as the control criterion to account for multiple hypothesis testing. The proportion of equivalently expressed genes (pi(0)) is a key quantity in the estimation of FDR. Some commonly used pi(0) estimators (BUM, SPLOSH, QVALUE, and LBE) are all based on p-values, and they are essentially upper bounds of pi(0). The simulations we carried out show that these four methods significantly overestimate the true pi(0) when differentially expressed genes and equivalently expressed genes are not well separated. To solve this problem, we first introduce a novel way of transforming the test statistics to make them symmetric about 0. Then we propose a pi(0) estimator based on the transformed test statistics using the symmetry assumption. Real data application and simulation both show that the pi(0) estimate from our method is less conservative than BUM, SPLOSH, QVALUE, and LBE in most of the cases. Simulation results also show that our estimator always has the least mean squared error among these five methods.
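
To make concrete why p-value-based estimators act as upper bounds on pi(0), here is a minimal sketch of a fixed-threshold, Storey-type estimator: it counts p-values above a cutoff and rescales by the fraction of null p-values expected there. This is a generic illustration, not the symmetry-based estimator proposed in the abstract. The cutoff `lam = 0.5` and the toy p-values are assumptions.

```python
def pi0_estimate(p_values, lam=0.5):
    """Estimate the proportion of true nulls as #{p > lam} / (m * (1 - lam)).

    Null p-values are uniform, so a fraction (1 - lam) of them exceed lam.
    Any non-null p-values that also exceed lam inflate the count, which is
    why the estimate tends to sit above the true proportion."""
    m = len(p_values)
    tail = sum(1 for p in p_values if p > lam)
    return min(1.0, tail / (m * (1.0 - lam)))


# Toy example: 80 null p-values on an even grid, plus 20 non-null p-values.
nulls = [(i + 0.5) / 80 for i in range(80)]
well_separated = nulls + [0.001] * 20   # non-nulls far from the nulls
poorly_separated = nulls + [0.6] * 20   # non-nulls overlapping the nulls

est_good = pi0_estimate(well_separated)
est_bad = pi0_estimate(poorly_separated)
```

In the first case the estimate matches the true proportion of 0.8. In the second, the weak signals land above the cutoff and push the estimate to its ceiling of 1.0, the kind of overestimation the abstract reports when the two groups of genes are not well separated.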


Subject(s)
Biomarkers, Tumor/genetics , Computational Biology , Gene Expression Profiling , Leukemia, Myeloid, Acute/genetics , Models, Statistical , Oligonucleotide Array Sequence Analysis , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Algorithms , Biomarkers, Tumor/metabolism , False Positive Reactions , Humans , Leukemia, Myeloid, Acute/metabolism , Pattern Recognition, Automated , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism
16.
J Pharmacokinet Pharmacodyn ; 35(5): 553-71, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18989761

ABSTRACT

A spline-enhanced ordinary differential equation (ODE) method is proposed for developing a proper parametric kinetic ODE model and is shown to be a useful approach to PK/PD model development. The new method differs substantially from a previously proposed model development approach using a stochastic differential equation (SDE)-based method. In the SDE-based method, a Gaussian diffusion term is introduced into an ODE to quantify the system noise. In our proposed method, we assume an ODE system with form dx/dt = A(t)x + B(t) where B(t) is a nonparametric function vector that is estimated using penalized splines. B(t) is used to construct a quantitative measure of model uncertainty useful for finding the proper model structure for a given data set. By means of two examples with simulated data, we demonstrate that the spline-enhanced ODE method can provide model diagnostics and serve as a basis for systematic model development similar to the SDE-based method. We compare and highlight the differences between the SDE-based and the spline-enhanced ODE methods of model development. We conclude that the spline-enhanced ODE method can be useful for PK/PD modeling since it is based on a relatively uncomplicated estimation algorithm which can be implemented with readily available software, provides numerically stable, robust estimation for many models, is distribution-free and allows for identification and accommodation of model deficiencies due to model misspecification.
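
The model form dx/dt = A(t)x + B(t) can be illustrated by integrating a scalar case numerically. The sketch below only solves the equation for user-supplied A(t) and B(t). It does not estimate B(t) with penalized splines, which is the core of the proposed method, and the elimination rate and input values are made up.

```python
import math


def solve_linear_ode(a, b, x0, t_end, n_steps=1000):
    """Integrate the scalar ODE dx/dt = a(t) * x + b(t) from t = 0 to t_end
    with the classical fourth-order Runge-Kutta scheme; returns x(t_end)."""
    h = t_end / n_steps

    def f(t, x):
        return a(t) * x + b(t)

    t, x = 0.0, x0
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


# One-compartment elimination with a hypothetical rate constant of 0.5 per hour.
k_el = 0.5
x_parametric = solve_linear_ode(lambda t: -k_el, lambda t: 0.0, 100.0, 2.0)

# Same structure with a nonzero B(t), standing in for unmodeled input.
x_with_b = solve_linear_ode(lambda t: -k_el, lambda t: 5.0, 100.0, 2.0)
```

When the parametric part A(t)x is adequate, an estimated B(t) should stay near zero; a B(t) that departs clearly from zero signals structure that the parametric model is missing, which is the diagnostic use described in the abstract.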


Subject(s)
Models, Biological , Pharmacokinetics , Pharmacology/statistics & numerical data , Algorithms , Animals , Computer Simulation , Humans , Pharmaceutical Preparations/administration & dosage , Pharmaceutical Preparations/blood , Pharmaceutical Preparations/metabolism , Statistics, Nonparametric , Stochastic Processes
17.
J Pharmacokinet Pharmacodyn ; 35(4): 443-63, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18781382

ABSTRACT

Motivated by the use of semiparametric nonlinear mixed-effects modeling on longitudinal data, we develop a new semiparametric modeling approach to address potential structural model misspecification for population pharmacokinetic/pharmacodynamic (PK/PD) analysis. Specifically, we use a set of ordinary differential equations (ODEs) with form dx/dt = A(t)x + B(t) where B(t) is a nonparametric function that is estimated using penalized splines. The inclusion of a nonparametric function in the ODEs makes identification of structural model misspecification feasible by quantifying the model uncertainty and provides flexibility for accommodating possible structural model deficiencies. The resulting model will be implemented in a nonlinear mixed-effects modeling setup for population analysis. We illustrate the method with an application to cefamandole data and evaluate its performance through simulations.


Subject(s)
Models, Statistical , Pharmacokinetics , Pharmacology/statistics & numerical data , Algorithms , Humans , Mandelic Acids/administration & dosage , Mandelic Acids/pharmacokinetics , Muscarinic Antagonists/administration & dosage , Muscarinic Antagonists/pharmacokinetics , Nonlinear Dynamics , Software
18.
BMC Bioinformatics ; 9 Suppl 6: S15, 2008 May 28.
Article in English | MEDLINE | ID: mdl-18541050

ABSTRACT

BACKGROUND: Historically, two categories of computational algorithms (alignment-based and alignment-free) have been applied to sequence comparison, one of the most fundamental issues in bioinformatics. Multiple sequence alignment, although dominantly used by biologists, possesses both fundamental and computational limitations. Consequently, alignment-free methods have been explored as important alternatives in estimating sequence similarity. Of the alignment-free methods, the string composition vector (CV) methods, which use the frequencies of nucleotide or amino acid strings to represent sequence information, show promising results in genome sequence comparison of prokaryotes. The existing CV-based methods, however, suffer from certain statistical problems, thereby underestimating the amount of evolutionary information in genetic sequences. RESULTS: We show that the existing string composition-based methods have two problems, one related to the Markov model assumption and the other associated with the denominator of the frequency normalization equation. We propose an improved complete composition vector method under the assumption of a uniform and independent model to estimate sequence information contributing to selection for sequence comparison. Phylogenetic analyses using both simulated and experimental data sets demonstrate that our new method is more robust than existing counterparts and comparable in robustness with alignment-based methods. CONCLUSION: We identified two problems in the currently used string composition methods and proposed a new robust method for the estimation of evolutionary information of genetic sequences. In addition, we argued that it might not be necessary to use relatively long strings to build a complete composition vector (CCV), due to the overlapping nature of vector strings with a variable length. We suggested a practical approach for the choice of an optimal string length to construct the CCV.
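
To make the idea of an alignment-free composition vector concrete, the following minimal Python sketch represents a DNA sequence by its raw k-mer frequencies and compares two sequences by cosine distance. It is only an illustration of the general string-composition idea: it does not apply the Markov-model background correction of the existing CV methods, nor the authors' improved CCV normalization, and the function names `kmer_frequencies` and `cosine_distance` are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k):
    """Return the frequency of every DNA string of length k, in lexicographic order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    # One vector entry per possible k-mer over the alphabet A, C, G, T.
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means identical composition."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


if __name__ == "__main__":
    a = kmer_frequencies("ACGTACGTACGTAACC", 2)
    b = kmer_frequencies("ACGTTCGTACGAAACC", 2)
    print(cosine_distance(a, b))
```

The vector length grows as 4^k, which is one practical reason the choice of string length matters.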


Subject(s)
Algorithms , DNA/chemistry , DNA/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Base Sequence , Molecular Sequence Data
19.
Bioinformatics ; 24(15): 1655-61, 2008 Aug 01.
Article in English | MEDLINE | ID: mdl-18573796

ABSTRACT

MOTIVATION: Recent attempts to account for multiple testing in the analysis of microarray data have focused on controlling the false discovery rate (FDR), which is defined as the expected proportion of false positive genes among the claimed significant genes. As a consequence, the accuracy of the FDR estimators will be important for correctly controlling FDR. Xie et al. found that the standard permutation method of estimating FDR is biased and proposed to delete the predicted differentially expressed (DE) genes in the estimation of FDR for one-sample comparison. However, we notice that the formula of the FDR used in their paper is incorrect. This makes the comparison results reported in their paper unconvincing. Other problems with their method include the biased estimation of FDR caused by over- or under-deletion of DE genes in the estimation of FDR and by the implicit use of an unreasonable estimator of the true proportion of equivalently expressed (EE) genes. Due to the great importance of accurate FDR estimation in microarray data analysis, it is necessary to point out such problems and propose improved methods. RESULTS: Our results confirm that the standard permutation method overestimates the FDR. With the correct FDR formula, we show that the method of Xie et al. always gives a biased estimate of FDR: it overestimates when the number of claimed significant genes is small, and underestimates when the number of claimed significant genes is large. To overcome these problems, we propose two modifications. The simulation results show that our estimator gives more accurate estimation.
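
For orientation, the standard permutation estimate of FDR that this abstract critiques can be sketched as follows for a one-sample comparison. This is a minimal illustration using random sign flips and a one-sample t-statistic, both of which are assumptions made here; it is the plain estimator (which the abstract reports as overestimating FDR), not the method of Xie et al. and not the authors' modified estimators. The names `t_statistic` and `permutation_fdr` are hypothetical.

```python
import math
import random
import statistics


def t_statistic(values):
    """One-sample t-statistic for a list of paired differences or log-ratios."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


def permutation_fdr(data, threshold, n_perm=100, seed=0):
    """Estimate FDR at |t| > threshold; data is a list of per-gene sample lists.

    Returns (estimated FDR, number of genes called significant).
    """
    rng = random.Random(seed)
    n_samples = len(data[0])
    n_called = sum(abs(t_statistic(row)) > threshold for row in data)
    if n_called == 0:
        return 0.0, 0

    total_null = 0
    for _ in range(n_perm):
        # Flip the sign of whole samples so gene-gene correlation is preserved.
        signs = [rng.choice((-1.0, 1.0)) for _ in range(n_samples)]
        total_null += sum(
            abs(t_statistic([s * x for s, x in zip(signs, row)])) > threshold
            for row in data
        )

    # Average null exceedances per permutation, divided by the observed calls.
    expected_false = total_null / n_perm
    return min(expected_false / n_called, 1.0), n_called
```

Because the permuted statistics are computed from all genes, including truly DE ones, the numerator tends to be inflated, which is consistent with the overestimation described in the abstract.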


Subject(s)
Algorithms , Artifacts , Data Interpretation, Statistical , False Positive Reactions , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods
20.
Funct Integr Genomics ; 8(3): 181-6, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18210172

ABSTRACT

The finite mixture model approach has attracted much attention in analyzing microarray data due to its robustness to the excessive variability which is common in microarray data. Pan (2003) proposed using the normal mixture model method (MMM) to estimate the distribution of a test statistic and its null distribution. However, considering the fact that the test statistic is often of t-type, our studies find that the rejection region from MMM is often significantly larger than the correct rejection region, resulting in an inflated type I error. This motivates us to propose the t-mixture model (TMM) approach. In this paper, we demonstrate that TMM provides significantly more accurate control of the probability of making type I errors (hence of the familywise error rate) than MMM. Finally, TMM is applied to the well-known leukemia data of Golub et al. (1999). The results are compared with those obtained from MMM.


Subject(s)
Gene Expression , Models, Genetic , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Computer Simulation , Genome, Human , Humans , Leukemia, Myeloid/genetics , Likelihood Functions , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics