Results 1 - 5 of 5
1.
Comput Stat ; : 1-20, 2022 Sep 18.
Article in English | MEDLINE | ID: mdl-36157067

ABSTRACT

Given the costliness of HIV drug therapy research, it is important not only to maximize the true positive rate (TPR) by identifying which genetic markers are related to drug resistance, but also to minimize the false discovery rate (FDR) by reducing the number of incorrectly selected markers that are unrelated to drug resistance. In this study, we propose a multiple testing procedure that unifies key concepts in computational statistics, namely Model-free Knockoffs, Bayesian variable selection, and the local false discovery rate. We develop an algorithm that utilizes the augmented data-Knockoff matrix and implements the Bayesian Lasso. We then identify signals using test statistics based on Markov Chain Monte Carlo outputs and the local false discovery rate. We test our proposed methods against non-Bayesian methods such as Benjamini-Hochberg (BHq) and Lasso regression in terms of TPR and FDR. Using numerical studies, we show that the proposed method yields lower FDR than BHq and Lasso in certain settings, such as the low-dimensional and equi-dimensional cases. We also discuss an application to an HIV-1 data set, with the aim of applying the method in future work to the analysis of genetic markers linked to drug-resistant HIV in the Philippines.
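
For orientation, the Benjamini-Hochberg baseline named in this abstract can be sketched as follows. This is a minimal illustration of the standard step-up procedure, not the authors' knockoff-based method; the function names and the toy p-values are assumptions made for this example. The second helper computes the false discovery proportion (FDP) and TPR for one set of rejections; the FDR is the expectation of the FDP over repeated experiments.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices rejected by the BH step-up procedure at level q."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is under q * rank / m.
        if p_values[idx] <= q * rank / m:
            k_max = rank
    return sorted(order[:k_max])


def fdp_and_tpr(rejected, true_signals):
    """False discovery proportion and true positive rate for one set of rejections."""
    rejected, true_signals = set(rejected), set(true_signals)
    false_discoveries = len(rejected - true_signals)
    fdp = false_discoveries / max(len(rejected), 1)
    tpr = len(rejected & true_signals) / max(len(true_signals), 1)
    return fdp, tpr


# Toy p-values (made up for illustration); indices 0, 1, 2 are treated as true signals.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(toy_p, q=0.05)
fdp, tpr = fdp_and_tpr(rejected, true_signals={0, 1, 2})
```

In a simulation study, averaging `fdp` and `tpr` over many replicated data sets gives empirical FDR and TPR estimates of the kind the abstract compares across methods.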

2.
Front Neurosci ; 16: 836100, 2022.
Article in English | MEDLINE | ID: mdl-35401090

ABSTRACT

High-dimensionality is ubiquitous in various scientific fields such as imaging genetics, where a deluge of functional and structural data on brain-relevant genetic polymorphisms is investigated. It is crucial to identify which genetic variations are consequential in identifying neurological features of brain connectivity, as opposed to merely random noise. Statistical inference in high-dimensional settings poses multiple challenges involving analytical and computational complexity. A widely implemented strategy in addressing inference goals is penalized inference. In particular, the role of the ridge penalty in high-dimensional prediction and estimation has been actively studied in the past several years. This study focuses on ridge-penalized tests in high-dimensional hypothesis testing problems by proposing and examining a class of methods for choosing the optimal ridge penalty. We present our findings on strategies to improve the statistical power of ridge-penalized tests and on what determines the optimal ridge penalty for hypothesis testing. The application of our work to an imaging genetics study and biological research will be presented.
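
As background on the ridge penalty discussed in this abstract, the sketch below computes the generic ridge estimate, i.e. the solution of (X'X + lam*I) beta = X'y, in plain Python. It does not reproduce the paper's test statistics or its rule for choosing the penalty, neither of which is given in the abstract; the function name and the toy data are assumptions. The last call illustrates why the penalty matters when there are more predictors than observations: X'X is singular there, but adding lam*I makes the system solvable.

```python
def ridge_coefficients(X, y, lam):
    """Solve (X'X + lam*I) beta = X'y by Gaussian elimination with partial pivoting."""
    n, p = len(X), len(X[0])
    # Augmented system [X'X + lam*I | X'y], built row by row.
    A = []
    for j in range(p):
        row = [sum(X[i][j] * X[i][k] for i in range(n)) + (lam if j == k else 0.0)
               for k in range(p)]
        row.append(sum(X[i][j] * y[i] for i in range(n)))
        A.append(row)
    # Forward elimination.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * p
    for j in reversed(range(p)):
        residual = A[j][p] - sum(A[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = residual / A[j][j]
    return beta


# Orthonormal toy design: the ridge penalty shrinks the least-squares solution.
shrunk = ridge_coefficients([[1.0, 0.0], [0.0, 1.0]], [2.0, 4.0], lam=1.0)

# One observation, two predictors (p > n): solvable only because lam > 0.
high_dim = ridge_coefficients([[1.0, 1.0]], [2.0], lam=1.0)
```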

3.
Hum Mol Genet ; 28(24): 4208-4218, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31691802

ABSTRACT

While much work has been done in associating differentially methylated positions (DMPs) with type 2 diabetes (T2D) across different populations, not much attention has been placed on identifying their possible functional consequences. We explored methylation changes in the peripheral blood of Filipinos with T2D and identified 177 associated DMPs. Most of these DMPs were associated with genes involved in metabolism, inflammation and the cell cycle. Three of these DMPs map to the TXNIP gene body, replicating previous findings from epigenome-wide association studies (EWAS) of T2D. The TXNIP downmethylation coincided with increased transcription at the 3' UTR, H3K36me3 histone markings and Sp1 binding, suggesting spurious transcription initiation at the TXNIP 3' UTR as a functional consequence of T2D methylation changes. We also explored potential epigenetic determinants of the increased incidence of T2D in Filipino immigrants in the USA and found three DMPs associated with the interaction of T2D and immigration. Two of these DMPs were located near MAP2K7 and PRMT1, which may point towards dysregulated stress response and inflammation as a contributing factor to T2D among Filipino immigrants.


Subject(s)
Carrier Proteins/genetics , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/genetics , Adult , Asian , Carrier Proteins/blood , Carrier Proteins/metabolism , DNA Methylation , Diabetes Mellitus, Type 2/metabolism , Epigenesis, Genetic , Female , Genome-Wide Association Study/methods , Humans , Male , Middle Aged
4.
Biometrics ; 74(2): 458-471, 2018 06.
Article in English | MEDLINE | ID: mdl-28940296

ABSTRACT

In recent mutation studies, analyses based on protein domain positions are gaining popularity over gene-centric approaches, since the latter have limitations in considering the functional context that the position of the mutation provides. This presents a large-scale simultaneous inference problem, with hundreds of hypothesis tests to consider at the same time. This article aims to select significant mutation counts while controlling Type I error at a given level via False Discovery Rate (FDR) procedures. One main assumption is that the mutation counts follow a zero-inflated model in order to account for the true zeros in the count model and the excess zeros. The class of models considered is the Zero-inflated Generalized Poisson (ZIGP) distribution. Furthermore, we assume that there exists a cut-off value such that counts smaller than this value are generated from the null distribution. We present several data-dependent methods to determine the cut-off value. We also consider a two-stage procedure based on a screening process, in which mutation counts exceeding a certain value are considered as significant mutations. Simulated and protein domain data sets are used to illustrate this procedure in the estimation of the empirical null using a mixture of discrete distributions. Overall, while maintaining control of the FDR, the proposed two-stage testing procedure has superior empirical power.
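
To make the zero-inflated model concrete, here is a minimal sketch of a ZIGP probability mass function. The abstract does not state which parameterization the authors use, so this assumes Consul's generalized Poisson form, P(Y = y) = theta * (theta + lam*y)^(y-1) * exp(-theta - lam*y) / y!, restricted to theta > 0 and 0 <= lam < 1, mixed with a point mass at zero of weight `pi0`. All function and parameter names are illustrative.

```python
import math


def generalized_poisson_pmf(y, theta, lam):
    """Consul's generalized Poisson pmf, assuming theta > 0 and 0 <= lam < 1."""
    if theta <= 0 or not (0 <= lam < 1):
        raise ValueError("this sketch assumes theta > 0 and 0 <= lam < 1")
    if y < 0:
        return 0.0
    # Work on the log scale to avoid overflow for large counts.
    log_p = (math.log(theta) + (y - 1) * math.log(theta + lam * y)
             - (theta + lam * y) - math.lgamma(y + 1))
    return math.exp(log_p)


def zigp_pmf(y, pi0, theta, lam):
    """Zero-inflated mixture: extra mass pi0 at zero, weight (1 - pi0) on the count model."""
    base = (1 - pi0) * generalized_poisson_pmf(y, theta, lam)
    return pi0 + base if y == 0 else base


# Probability of a zero count combines excess zeros and zeros from the count model.
p_zero = zigp_pmf(0, pi0=0.2, theta=1.5, lam=0.3)
# The pmf should sum to (numerically) one over the non-negative integers.
total = sum(zigp_pmf(y, pi0=0.2, theta=1.5, lam=0.3) for y in range(200))
```

Setting `lam=0` recovers the ordinary zero-inflated Poisson, which is a convenient sanity check when fitting such a model as an empirical null.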


Subject(s)
Biometry/methods , Data Interpretation, Statistical , Protein Domains , Statistical Distributions , DNA Mutational Analysis , Databases, Protein , Humans , Mutation Rate , Poisson Distribution
5.
PLoS Comput Biol ; 13(4): e1005428, 2017 04.
Article in English | MEDLINE | ID: mdl-28426665

ABSTRACT

The fight against cancer is hindered by its highly heterogeneous nature. Genome-wide sequencing studies have shown that individual malignancies contain many mutations that range from those commonly found in tumor genomes to rare somatic variants present only in a small fraction of lesions. Such rare somatic variants dominate the landscape of genomic mutations in cancer, yet efforts to correlate somatic mutations found in one or a few individuals with functional roles have been largely unsuccessful. Traditional methods for identifying somatic variants that drive cancer are 'gene-centric' in that they consider only somatic variants within a particular gene and make no comparison to other similar genes in the same family that may play a similar role in cancer. In this work, we present oncodomain hotspots, a new 'domain-centric' method for identifying clusters of somatic mutations across entire gene families using protein domain models. Our analysis confirms that our approach creates a framework for leveraging structural and functional information encapsulated by protein domains into the analysis of somatic variants in cancer, enabling the assessment of even rare somatic variants by comparison to similar genes. Our results reveal a vast landscape of somatic variants that act at the level of domain families, altering pathways known to be involved in cancer such as protein phosphorylation, signaling, gene regulation, and cell metabolism. Due to oncodomain hotspots' unique ability to assess rare variants, we expect our method to become an important tool for the analysis of sequenced tumor genomes, complementing existing methods.
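
The contrast between gene-centric and domain-centric counting can be illustrated with a toy sketch. This is not the oncodomain hotspot algorithm, whose details are not given in the abstract; it only shows the pooling idea, assuming a hypothetical lookup table `domain_map` that sends a (gene, protein position) pair to a (domain family, aligned domain position) pair. All gene names, domain names, and positions below are made up.

```python
from collections import Counter


def pool_by_domain_position(mutations, domain_map):
    """Count mutations per aligned domain position, pooling across genes in a family."""
    counts = Counter()
    for gene, protein_pos in mutations:
        key = domain_map.get((gene, protein_pos))
        if key is not None:  # mutations outside any annotated domain are skipped
            counts[key] += 1
    return counts


# Hypothetical mapping: two genes share DOMAIN_X, and their residues align at position 5.
domain_map = {
    ("GENE_A", 12): ("DOMAIN_X", 5),
    ("GENE_B", 40): ("DOMAIN_X", 5),
    ("GENE_B", 41): ("DOMAIN_X", 6),
}
mutations = [("GENE_A", 12), ("GENE_B", 40), ("GENE_B", 41), ("GENE_C", 7)]

gene_counts = Counter(gene for gene, _ in mutations)
domain_counts = pool_by_domain_position(mutations, domain_map)
```

In this toy data, each single-gene position carries only one mutation, but the aligned domain position accumulates two, which is the kind of pooling that lets rare variants be assessed by comparison to similar genes.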


Subject(s)
Computational Biology/methods , Mutation/genetics , Neoplasms/genetics , Oncogene Proteins/genetics , Protein Domains/genetics , Databases, Protein , Epidermal Growth Factor/genetics , Humans , Mitochondrial Proteins/genetics , Models, Molecular , Oncogene Proteins/classification , Protein Binding , ras Proteins/genetics