1.
Front Public Health ; 11: 1259410, 2023.
Article in English | MEDLINE | ID: mdl-38146480

ABSTRACT

Introduction: There is a vast literature on the performance of different short-term forecasting models for country-specific COVID-19 cases, but much less research with respect to city-level cases. This paper employs daily case counts for 25 Metropolitan Statistical Areas (MSAs) in the U.S. to evaluate the efficacy of a variety of statistical forecasting models with respect to 7-day and 28-day ahead predictions. Methods: This study employed Gradient Boosted Regression Trees (GBRT), Linear Mixed Effects (LME), Susceptible, Infectious, or Recovered (SIR), and Seasonal Autoregressive Integrated Moving Average (SARIMA) models to generate daily forecasts of COVID-19 cases from November 2020 to March 2021. Results: Consistent with other research that has employed Machine Learning (ML) based methods, we find that Median Absolute Percentage Error (MAPE) values for both 7-day ahead and 28-day ahead predictions from GBRTs are lower than corresponding values from SIR, LME, and SARIMA specifications for the majority of MSAs during November-December 2020 and January 2021. GBRT and SARIMA models do not offer high-quality predictions for February 2021. However, SARIMA-generated MAPE values for 28-day ahead predictions are slightly lower than corresponding GBRT estimates for March 2021. Discussion: The results of this research demonstrate that basic ML models can lead to relatively accurate forecasts at the local level, which is important for resource allocation decisions and epidemiological surveillance by policymakers.
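The abstract above ranks models by their Median Absolute Percentage Error. A minimal sketch of how such a score could be computed follows; the function name, the handling of zero-count days, and the case counts are illustrative assumptions, not the paper's code or data.

```python
from statistics import median

def median_ape(actual, forecast):
    """Median absolute percentage error (in percent) over paired observations."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    # Assumption: days with zero reported cases are skipped, because the
    # percentage error is undefined when the actual value is zero.
    errors = [abs(a - f) / abs(a) * 100.0
              for a, f in zip(actual, forecast) if a != 0]
    if not errors:
        raise ValueError("no non-zero actual values to score")
    return median(errors)

# Hypothetical daily case counts for one area and two competing forecasts.
actual = [120, 135, 150, 160, 155, 170, 180]
model_a = [118, 140, 145, 158, 160, 165, 190]
model_b = [100, 120, 170, 150, 140, 200, 160]

print("model A:", round(median_ape(actual, model_a), 2))
print("model B:", round(median_ape(actual, model_b), 2))
```

Using the median rather than the mean keeps a single badly missed day from dominating the score, which may matter for noisy daily counts.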


Subjects
COVID-19 , Humans , Cities/epidemiology , Seasons , Incidence , COVID-19/epidemiology , Models, Statistical
2.
Nat Commun ; 14(1): 4050, 2023 07 08.
Article in English | MEDLINE | ID: mdl-37422469

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of cellular heterogeneity in health and disease. However, the lack of physical relationships among dissociated cells has limited its applications. To address this issue, we present CeLEry (Cell Location recovEry), a supervised deep learning algorithm that leverages gene expression and spatial location relationships learned from spatial transcriptomics to recover the spatial origins of cells in scRNA-seq. CeLEry has an optional data augmentation procedure via a variational autoencoder, which improves the method's robustness and allows it to overcome noise in scRNA-seq data. We show that CeLEry can infer the spatial origins of cells in scRNA-seq at multiple levels, including 2D location and spatial domain of a cell, while also providing uncertainty estimates for the recovered locations. Our comprehensive benchmarking evaluations on multiple datasets generated from brain and cancer tissues using Visium, MERSCOPE, MERFISH, and Xenium demonstrate that CeLEry can reliably recover the spatial location information for cells using scRNA-seq data.


Subjects
Apium , Transcriptome , Transcriptome/genetics , Apium/genetics , Single-Cell Gene Expression Analysis , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods
3.
J Appl Stat ; 50(7): 1611-1634, 2023.
Article in English | MEDLINE | ID: mdl-37197758

ABSTRACT

Autoregressive (AR) models are useful in time series analysis. Inferences under such models are distorted in the presence of measurement error, a common feature in applications. In this article, we establish analytical results for quantifying the biases of the parameter estimation in AR models if the measurement error effects are neglected. We consider two measurement error models to describe different data contamination scenarios. We propose an estimating equation approach to estimate the AR model parameters with measurement error effects accounted for. We further discuss forecasting using the proposed method. Our work is inspired by COVID-19 data, which are error-contaminated due to multiple reasons including those related to asymptomatic cases and varying incubation periods. We implement the proposed method by conducting sensitivity analyses and forecasting the fatality rate of COVID-19 over time for the four most populated provinces in Canada. The results suggest that incorporating or not incorporating measurement error effects may yield rather different results for parameter estimation and forecasting.
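To make the bias described above concrete, here is a small simulation sketch. It assumes the simplest textbook case only: an AR(1) series observed with classical additive, independent Gaussian error. The paper's two measurement error models and its estimating-equation correction are not reproduced here, and all function names and parameter values are hypothetical.

```python
import random

def simulate_ar1(n, phi, sigma_eps, rng):
    """Simulate a zero-mean stationary AR(1) series X_t = phi * X_{t-1} + eps_t."""
    # Start from the stationary distribution so no burn-in is needed.
    x = rng.gauss(0.0, sigma_eps / (1.0 - phi ** 2) ** 0.5)
    series = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma_eps)
        series.append(x)
    return series

def lag1_estimate(series):
    """Naive no-intercept least-squares estimate of phi from a single series."""
    num = sum(cur * prev for cur, prev in zip(series[1:], series[:-1]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den

def attenuated_phi(phi, sigma_eps, sigma_u):
    """Large-sample limit of the naive estimate under additive error variance sigma_u^2."""
    var_x = sigma_eps ** 2 / (1.0 - phi ** 2)
    return phi * var_x / (var_x + sigma_u ** 2)

rng = random.Random(0)
phi, sigma_eps, sigma_u = 0.8, 1.0, 1.0
truth = simulate_ar1(20000, phi, sigma_eps, rng)
observed = [x + rng.gauss(0.0, sigma_u) for x in truth]  # error-contaminated series

print("estimate from error-free series:", round(lag1_estimate(truth), 3))
print("estimate from contaminated series:", round(lag1_estimate(observed), 3))
print("theoretical attenuated value:", round(attenuated_phi(phi, sigma_eps, sigma_u), 3))
```

In this setting the naive estimate shrinks toward zero by the factor var(X) / (var(X) + var(U)), so ignoring the error understates the persistence of the series.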

4.
Econ Anal Policy ; 78: 225-242, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36941918

ABSTRACT

The enactment of COVID-19 policies in Canada falls under provincial jurisdiction. This study exploits time-series variation across four Canadian provinces to evaluate the effects of stricter COVID-19 policies on daily case counts. Employing data from this time-period allows an evaluation of the efficacy of policies independent of vaccine impacts. While both OLS and IV results offer evidence that more stringent Non-Pharmaceutical Interventions (NPIs) can reduce daily case counts within a short time-period, IV estimates are larger in magnitude. Hence, studies that fail to control for simultaneity bias might produce confounded estimates of the efficacy of NPIs. However, IV estimates should be treated as correlations given the possibility of other unobserved determinants of COVID-19 spread and mismeasurement of daily cases. With respect to specific policies, mandatory mask usage in indoor spaces and restrictions on business operations are significantly associated with lower daily cases. We also test the efficacy of different forecasting models. Our results suggest that Gradient Boosted Regression Trees (GBRT) and Seasonal Autoregressive-Integrated Moving Average (SARIMA) models produce more accurate short-run forecasts relative to Vector Auto Regressive (VAR), and Susceptible-Infected-Removed (SIR) epidemiology models. Forecasts from SIR models are also inferior to results from basic OLS regressions. However, predictions from models that are unable to correct for endogeneity bias should be treated with caution.
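The contrast between OLS and IV estimates mentioned above can be illustrated with a toy simulation. This sketch assumes one regressor, one hypothetical instrument `z`, and an unobserved confounder `u`; it is not the study's specification, instruments, or data.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(z, x, y):
    """Simple instrumental-variables (Wald) slope using a single instrument z."""
    return covariance(z, y) / covariance(z, x)

rng = random.Random(1)
n = 20000
true_effect = -1.0  # assumed effect of policy stringency x on the outcome y

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: shifts x, unrelated to u
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved factor driving both x and y
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

print("OLS slope:", round(ols_slope(x, y), 3))   # pulled toward zero by u
print("IV slope:", round(iv_slope(z, x, y), 3))  # close to the assumed effect
```

In this constructed example the OLS slope converges to about -0.33 while the IV slope recovers the assumed -1.0, so the IV estimate is larger in magnitude. The estimate is only as credible as the instrument's independence from the unobserved factor.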

5.
PLoS One ; 18(2): e0277878, 2023.
Article in English | MEDLINE | ID: mdl-36827382

ABSTRACT

While the impact of the COVID-19 pandemic has been widely studied, relatively fewer discussions about the sentimental reaction of the public are available. In this article, we scrape COVID-19 related tweets on the microblogging platform, Twitter, and examine the tweets from February 24, 2020 to October 14, 2020 in four Canadian cities (Toronto, Montreal, Vancouver, and Calgary) and four U.S. cities (New York, Los Angeles, Chicago, and Seattle). Applying the RoBERTa, Vader and NRC approaches, we evaluate sentiment intensity scores and visualize the results over different periods of the pandemic. Sentiment scores for the tweets concerning three anti-epidemic measures, "masks", "vaccine", and "lockdown", are computed for comparison. We explore possible causal relationships among the variables concerning tweet activities and sentiment scores of COVID-19 related tweets by integrating the echo state network method with convergent cross-mapping. Our analyses show that public sentiments about COVID-19 vary from time to time and from place to place, and are different with respect to anti-epidemic measures of "masks", "vaccines", and "lockdown". Evidence of the causal relationship is revealed for the examined variables, assuming the suggested model is feasible.


Subjects
COVID-19 , Social Media , Vaccines , Humans , Sentiment Analysis , Pandemics , Canada , Learning
6.
Biometrics ; 79(2): 1073-1088, 2023 06.
Article in English | MEDLINE | ID: mdl-35032335

ABSTRACT

Research on complex associations between a gene network and multiple responses has attracted increasing attention. A great challenge in analyzing genetic data is posed by the presence of the genetic network that is typically unknown. Moreover, mismeasurement of responses introduces additional complexity that distorts the usual inferential procedures. In this paper, we consider the problem with mixed binary and continuous responses that are subject to mismeasurement and associated with complex structured covariates. We start with the case where data are precisely measured. We propose a generalized network structured model and develop a two-step inferential procedure. In the first step, we employ a Gaussian graphical model to facilitate the covariates network structure, and in the second step, we incorporate the estimated graphical structure of covariates and develop an estimating equation method. Furthermore, we extend the development to accommodate mismeasured responses. We consider two cases where the information on mismeasurement is either known or estimated from a validation sample. Theoretical results are established and numerical studies are conducted to evaluate the finite sample performance of the proposed methods. We apply the proposed method to analyze the outbred Carworth Farms White mice data arising from a genome-wide association study.


Subjects
Gene Regulatory Networks , Genome-Wide Association Study , Animals , Mice , Normal Distribution
7.
Biometrics ; 79(2): 1089-1102, 2023 06.
Article in English | MEDLINE | ID: mdl-35261029

ABSTRACT

Zero-inflated count data arise frequently from genomics studies. Analysis of such data is often based on a mixture model which facilitates excess zeros in combination with a Poisson distribution, and various inference methods have been proposed under such a model. Those analysis procedures, however, are challenged by the presence of measurement error in count responses. In this article, we propose a new measurement error model to describe error-contaminated count data. We show that ignoring the measurement error effects in the analysis may generally lead to invalid inference results, and meanwhile, we identify situations where ignoring measurement error can still yield consistent estimators. Furthermore, we propose a Bayesian method to address the effects of measurement error under the zero-inflated Poisson model and discuss the identifiability issues. We develop a data-augmentation algorithm that is easy to implement. Simulation studies are conducted to evaluate the performance of the proposed method. We apply our method to analyze the data arising from a prostate adenocarcinoma genomic study.
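The mixture described above, excess zeros combined with a Poisson distribution, can be written down directly. The sketch below covers only the standard error-free zero-inflated Poisson probability mass function; the measurement error model and the Bayesian procedure from the abstract are not shown, and the parameter names and values are assumptions for illustration.

```python
import math

def zip_pmf(k, pi_zero, lam):
    """P(Y = k) for a zero-inflated Poisson variable.

    pi_zero: probability of a structural (excess) zero, between 0 and 1.
    lam: mean of the Poisson component.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson component.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

# Hypothetical parameters: 30% structural zeros, Poisson mean 2.5.
pi_zero, lam = 0.3, 2.5
for k in range(5):
    print(k, round(zip_pmf(k, pi_zero, lam), 4))
```

The mean of this distribution is (1 - pi_zero) * lam, which is lower than the Poisson mean because part of the probability mass is moved to zero.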


Subjects
Algorithms , Models, Statistical , Male , Humans , Bayes Theorem , Computer Simulation , Poisson Distribution
8.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36208175

ABSTRACT

Cell-type composition of intact bulk tissues can vary across samples. Deciphering cell-type composition and its changes during disease progression is an important step toward understanding disease pathogenesis. To infer cell-type composition, existing cell-type deconvolution methods for bulk RNA sequencing (RNA-seq) data often require matched single-cell RNA-seq (scRNA-seq) data, generated from samples with similar clinical conditions, as reference. However, due to the difficulty of obtaining scRNA-seq data in diseased samples, only limited scRNA-seq data in matched disease conditions are available. Using scRNA-seq reference to deconvolve bulk RNA-seq data from samples with different disease conditions may lead to a biased estimation of cell-type proportions. To overcome this limitation, we propose an iterative estimation procedure, MuSiC2, which is an extension of MuSiC, to perform deconvolution analysis of bulk RNA-seq data generated from samples with multiple clinical conditions where at least one condition is different from that of the scRNA-seq reference. Extensive benchmark evaluations indicated that MuSiC2 improved the accuracy of cell-type proportion estimates of bulk RNA-seq samples under different conditions as compared with the traditional MuSiC deconvolution. MuSiC2 was applied to two bulk RNA-seq datasets for deconvolution analysis, including one from human pancreatic islets and the other from human retina. We show that MuSiC2 improves current deconvolution methods and provides more accurate cell-type proportion estimates when the bulk and single-cell reference differ in clinical conditions. We believe the condition-specific cell-type composition estimates from MuSiC2 will facilitate the downstream analysis and help identify cellular targets of human diseases.


Subjects
RNA , Single-Cell Analysis , Humans , RNA/genetics , RNA-Seq , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Transcriptome , Sequence Analysis, RNA/methods
9.
Can Public Policy ; 48(1): 144-161, 2022 Mar 01.
Article in English | MEDLINE | ID: mdl-36039068

ABSTRACT

This study uses coronavirus disease 2019 (COVID-19) case counts and Google mobility data for 12 of Ontario's largest Public Health Units from Spring 2020 until the end of January 2021 to evaluate the effects of non-pharmaceutical interventions (NPIs; policy restrictions on business operations and social gatherings) and population mobility on daily cases. Instrumental variables (IV) estimation is used to account for potential simultaneity bias, because both daily COVID-19 cases and NPIs are dependent on lagged case numbers. IV estimates based on differences in lag lengths to infer causal estimates imply that the implementation of stricter NPIs and indoor mask mandates are associated with reductions in COVID-19 cases. Moreover, estimates based on Google mobility data suggest that increases in workplace attendance are correlated with higher case counts. Finally, from October 2020 to January 2021, daily Ontario forecasts from Box-Jenkins time-series models are more accurate than official forecasts and forecasts from a susceptible-infected-removed epidemiology model.



10.
Stat Methods Med Res ; 30(5): 1155-1186, 2021 05.
Article in English | MEDLINE | ID: mdl-33635738

ABSTRACT

Bivariate responses with mixed continuous and binary variables arise commonly in applications such as clinical trials and genetic studies. Statistical methods based on jointly modeling continuous and binary variables have been available. However, such methods ignore the effects of response mismeasurement, a ubiquitous feature in applications. It has been well studied that in many settings, ignorance of mismeasurement in variables usually results in biased estimation. In this paper, we consider the setting with a bivariate outcome vector which contains a continuous component and a binary component both subject to mismeasurement. We propose estimating equation approaches to handle measurement error in the continuous response and misclassification in the binary response simultaneously. The proposed estimators are consistent and robust to certain model misspecification, provided regularity conditions. Extensive simulation studies confirm that the proposed methods successfully correct the biases resulting from the error-in-variables under various settings. The proposed methods are applied to analyze the outbred Carworth Farms White mice data arising from a genome-wide association study.


Subjects
Genome-Wide Association Study , Models, Statistical , Animals , Bias , Causality , Computer Simulation , Cost-Benefit Analysis , Mice
11.
PLoS One ; 16(1): e0244536, 2021.
Article in English | MEDLINE | ID: mdl-33465142

ABSTRACT

BACKGROUND: Since March 11, 2020, when the World Health Organization (WHO) declared the COVID-19 pandemic, the number of infected cases, the number of deaths, and the number of affected countries have climbed rapidly. To understand the impact of COVID-19 on public health, many studies have been conducted for various countries. To complement the available work, in this article we examine Canadian COVID-19 data for the period of March 18, 2020 to August 16, 2020 with the aim of forecasting the short-term trend. METHOD: We focus our attention on Canadian data and analyze the four provinces, Ontario, Alberta, British Columbia, and Quebec, which have the most severe situations in Canada. To build predictive models and conduct prediction, we employ three models, smooth transition autoregressive (STAR) models, neural network (NN) models, and susceptible-infected-removed (SIR) models, to fit time series data of confirmed cases in the four provinces separately. For comparison, we also analyze the data of daily infections in two U.S. states, Texas and New York, for the period of March 18, 2020 to August 16, 2020. We emphasize that different models make different assumptions, which are difficult to validate. Yet invoking different models allows us to examine the data from different angles, thus helping reveal the underlying trajectory of the development of COVID-19 in Canada. FINDING: The examinations of the data dated from March 18, 2020 to August 11, 2020 show that the STAR, NN, and SIR models may output different results, though the differences are small in some cases. Prediction over a short-term period incurs smaller prediction variability than over a long-term period, as expected. The NN method tends to outperform the other two methods. All the methods forecast an upward trend in all four Canadian provinces for the period of August 12, 2020 to August 23, 2020, though the degree varies from method to method.
This research offers model-based insights into the evolution of the pandemic in Canada.
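For readers unfamiliar with the susceptible-infected-removed (SIR) model named above, a minimal discrete-time sketch follows. It uses a daily Euler step with a fixed transmission rate `beta` and removal rate `gamma`; these parameter values, the population size, and the function name are hypothetical, and the study's actual fitting procedure is not reproduced.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Advance a deterministic SIR model one day at a time.

    Returns a list of (susceptible, infected, removed) tuples, one per day,
    including the starting state.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population  # contacts between S and I
        new_removals = gamma * i                    # recoveries and deaths
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history

# Hypothetical setting: one million people, 100 initial cases,
# beta = 0.3 and gamma = 0.1 per day (basic reproduction number 3).
trajectory = simulate_sir(1_000_000, 100, beta=0.3, gamma=0.1, days=120)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print("day of peak infections:", peak_day)
```

Because the three compartments only exchange people, their sum stays equal to the population at every step, which is a quick sanity check for any implementation.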


Subjects
COVID-19/epidemiology , COVID-19/mortality , Canada/epidemiology , Demography/statistics & numerical data , Humans , Models, Statistical , Mortality/trends , Neural Networks, Computer
12.
Stat Med ; 39(26): 3700-3719, 2020 11 20.
Article in English | MEDLINE | ID: mdl-32914420

ABSTRACT

In genetic association studies, mixed effects models have been widely used in detecting the pleiotropy effects which occur when one gene affects multiple phenotype traits. In particular, bivariate mixed effects models are useful for describing the association of a gene with a continuous trait and a binary trait. However, such models are inadequate to feature the data with response mismeasurement, a characteristic that is often overlooked. It has been well studied that in univariate settings, ignorance of mismeasurement in variables usually results in biased estimation. In this paper, we consider the setting with a bivariate outcome vector which contains a continuous component and a binary component both subject to mismeasurement. We propose an induced likelihood approach and an EM algorithm method to handle measurement error in continuous response and misclassification in binary response simultaneously. Simulation studies confirm that the proposed methods successfully remove the bias induced from the response mismeasurement.


Subjects
Bias , Genetic Association Studies , Computer Simulation , Likelihood Functions , Phenotype
13.
Support Care Cancer ; 28(7): 3409-3419, 2020 Jul.
Article in English | MEDLINE | ID: mdl-31781945

ABSTRACT

BACKGROUND: Smoking cessation is an integral part of cancer survivorship. To help improve survivorship education, clinicians need an understanding of patient awareness of the harms of continued smoking. METHODS: Cancer survivors from Princess Margaret Cancer Centre (Toronto, ON) were surveyed on their awareness of the harms of continued smoking on cancer-related outcomes. Multivariable logistic regression models assessed factors associated with awareness and whether awareness was associated with subsequent cessation among smokers at diagnosis. RESULTS: Among 1118 patients, 23% were current smokers pre-diagnosis and 54% subsequently quit; 25% had lung and 30% head and neck cancers. Many patients reported being unaware that continued smoking results in greater cancer surgical complications (53%), increased radiation side effects (62%), decreased quality of life during chemotherapy (51%), decreased chemotherapy or radiation efficacy (57%), increased risk of death (40%), and increased development of second primaries (38%). Being a current smoker was associated with greater lack of awareness of some of these smoking harms (aORs = 1.53-2.20, P < 0.001-0.02), as was exposure to any second-hand smoke (aORs = 1.45-1.53, P = 0.006-0.04) and being diagnosed with early stage cancer (aORs = 1.38-2.31, P < 0.001-0.06). Among current smokers, those who had fewer pack-years, were being treated for cure, or had a non-tobacco-related cancer were more likely to be unaware. Awareness that continued tobacco use worsens quality of life after chemotherapy was associated with subsequent cessation (aOR = 2.26, P = 0.006). CONCLUSIONS: Many cancer survivors are unaware that continued smoking can negatively impact cancer-related outcomes. The impact of educating patients about the potential harms of continued smoking when discussing treatment plans should be further evaluated.


Subjects
Cancer Survivors/psychology , Quality of Life/psychology , Smoking Cessation/methods , Smoking/adverse effects , Female , Humans , Male , Middle Aged
14.
Head Neck ; 39(6): 1226-1233, 2017 06.
Article in English | MEDLINE | ID: mdl-28323362

ABSTRACT

BACKGROUND: Body mass index (BMI) has been associated variably with head and neck cancer outcomes. We evaluated the association between BMI at either diagnosis or early adulthood and head and neck cancer outcomes. METHODS: Patients with invasive head and neck squamous cell cancer at Princess Margaret Cancer Centre in Toronto, Canada, were surveyed on tobacco and alcohol exposure, performance status, comorbidities, and BMI at diagnosis. A subset also had data collected for BMI at early adulthood. RESULTS: With a median follow-up of 2.5 years, in 1279 analyzed patients, being overweight (hazard ratio [HR], 0.55; 95% confidence interval [CI], 0.4-0.8; p = .001) at diagnosis was associated with improved survival when compared with individuals with normal weight. In contrast, being underweight at diagnosis was associated with a worse outcome (HR, 1.89; 95% CI, 1.2-3.1; p < .01). CONCLUSION: Being underweight at diagnosis was an independent, adverse prognostic factor, whereas being overweight conferred better prognosis. BMI in early adulthood was not associated strongly with head and neck cancer outcomes. © 2017 Wiley Periodicals, Inc. Head Neck 39: 1226-1233, 2017.


Subjects
Body Mass Index , Carcinoma, Squamous Cell/epidemiology , Cause of Death , Head and Neck Neoplasms/epidemiology , Obesity/epidemiology , Adult , Aged , Cancer Care Facilities , Carcinoma, Squamous Cell/pathology , Carcinoma, Squamous Cell/therapy , Cohort Studies , Comorbidity , Disease-Free Survival , Female , Head and Neck Neoplasms/pathology , Head and Neck Neoplasms/therapy , Humans , Male , Middle Aged , Obesity/diagnosis , Ontario , Prognosis , Proportional Hazards Models , Risk Assessment , Squamous Cell Carcinoma of Head and Neck , Survival Analysis , Translational Research, Biomedical
15.
Cancer Med ; 6(2): 361-373, 2017 02.
Article in English | MEDLINE | ID: mdl-28074552

ABSTRACT

Polymorphisms in miRNA and miRNA pathway genes have been previously associated with cancer risk and outcome, but have not been studied in esophageal adenocarcinoma outcomes. Here, we evaluate candidate miRNA pathway polymorphisms in esophageal adenocarcinoma prognosis and attempt to validate them in an independent cohort of esophageal adenocarcinoma patients. Among 231 esophageal adenocarcinoma patients of all stages/treatment plans, 38 candidate genetic polymorphisms (17 biogenesis, 9 miRNA targets, 5 pri-miRNA, 7 pre-miRNA) were genotyped and analyzed. Cox proportional hazard models adjusted for sociodemographic and clinicopathological covariates helped assess the association of genetic polymorphisms with overall survival (OS) and progression-free survival (PFS). Significantly associated polymorphisms were then evaluated in an independent cohort of 137 esophageal adenocarcinoma patients. Among the 231 discovery cohort patients, 86% were male, median diagnosis age was 64 years, 34% were metastatic at diagnosis, and median OS and PFS were 20 and 12 months, respectively. GEMIN3 rs197412 (aHR = 1.37, 95%CI: [1.04-1.80]; P = 0.02), hsa-mir-124-1 rs531564 (aHR = 0.60, 95% CI: [0.53-0.90]; P = 0.05), and KIAA0423 rs1053667 (aHR = 0.51, 95% CI: [0.28-0.96]; P = 0.04) were found associated with OS. Furthermore, GEMIN3 rs197412 (aHR = 1.33, 95% CI: [1.03-1.74]; P = 0.03) and KRT81 rs3660 (aHR = 1.29, 95% CI: [1.01-1.64]; P = 0.04) were found associated with PFS. Although none of these polymorphisms were significant in the second cohort, hsa-mir-124-1 rs531564 and KIAA0423 rs1053667 had trends in the same direction; when both cohorts were combined together, GEMIN3 rs197412, hsa-mir-124-1 rs531564, and KIAA0423 rs1053667 remained significantly associated with OS. 
We demonstrate the association of multiple miRNA pathway polymorphisms with esophageal adenocarcinoma prognosis in a discovery cohort of patients, which did not validate in a separate cohort but had consistent associations in the pooled cohort. Larger studies are required to confirm/validate the prognostic value of these polymorphisms in esophageal adenocarcinoma.


Subjects
Adenocarcinoma/genetics , Esophageal Neoplasms/genetics , Gene Regulatory Networks , MicroRNAs/genetics , Polymorphism, Single Nucleotide , Adult , Aged , Aged, 80 and over , Disease-Free Survival , Female , Humans , Male , Middle Aged , Neoplasm Metastasis , Prognosis , Proportional Hazards Models , Survival Analysis