Results 1 - 10 of 10
1.
NPJ Precis Oncol ; 8(1): 106, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762647

ABSTRACT

Due to cancer's complex nature and variable response to therapy, precision oncology informed by omics sequence analysis has become the current standard of care. However, the amount of data produced for each patient makes it difficult to quickly identify the best treatment regimen. Moreover, limited data availability has hindered computational methods' abilities to learn patterns associated with effective drug-cell line pairs. In this work, we propose the use of contrastive learning to improve learned drug and cell line representations by preserving relationship structures associated with drug mechanisms of action and cell line cancer types. In addition to achieving enhanced performance relative to a state-of-the-art method, we find that classifiers using our learned representations exhibit a more balanced reliance on drug- and cell line-derived features when making predictions. This facilitates more personalized drug prioritizations that are informed by signals related to drug resistance.
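
The contrastive idea in this abstract can be illustrated with a generic margin-based pairwise loss: embeddings that share a label (for example, a drug mechanism of action) are pulled together, and embeddings with different labels are pushed at least a margin apart. This is a minimal sketch of a standard contrastive formulation, not the authors' model; the function names, the toy embeddings, the labels, and the margin of 1.0 are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_contrastive_loss(embeddings, labels, margin=1.0):
    """Mean margin-based contrastive loss over all unordered pairs.

    Same-label pairs contribute d^2 (pull together); different-label
    pairs contribute max(0, margin - d)^2 (push apart up to the margin).
    """
    total, n_pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            d = euclidean(embeddings[i], embeddings[j])
            if labels[i] == labels[j]:
                total += d ** 2
            else:
                total += max(0.0, margin - d) ** 2
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Hypothetical 2-D drug embeddings and mechanism-of-action labels.
drug_embeddings = [[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]]
moa_labels = ["kinase_inhibitor", "kinase_inhibitor", "antimetabolite"]
loss = pairwise_contrastive_loss(drug_embeddings, moa_labels)
```

In this toy case only the same-label pair contributes to the loss, because the differently labeled drug already lies farther away than the margin.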

2.
Pediatr Res ; 95(1): 146-155, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37700164

ABSTRACT

BACKGROUND: Pathogenic GATA6 variants have been associated with congenital heart disease (CHD) and a spectrum of extracardiac abnormalities, including pancreatic agenesis, congenital diaphragmatic hernia, and developmental delay. However, the comprehensive genotype-phenotype correlation of pathogenic GATA6 variation in humans remains to be fully understood. METHODS: Exome sequencing was performed in a family where four members had CHD. In vitro functional analysis of the GATA6 variant was performed using immunofluorescence, western blot, and dual-luciferase reporter assay. RESULTS: A novel, heterozygous missense variant in GATA6 (c.1403G>A; p.Cys468Tyr) segregated with affected members in a family with CHD, including three with persistent truncus arteriosus. In addition, one member had childhood-onset diabetes mellitus (DM), and another had necrotizing enterocolitis (NEC) with intestinal perforation. The p.Cys468Tyr variant was located in the C-terminal zinc finger domain encoded by exon 4. The mutant protein demonstrated an abnormal nuclear localization pattern with protein aggregation and decreased transcriptional activity. CONCLUSIONS: We report a novel, familial GATA6 likely pathogenic variant associated with CHD, DM, and NEC with intestinal perforation. These findings expand the phenotypic spectrum of pathologic GATA6 variation to include intestinal abnormalities. IMPACT: Exome sequencing identified a novel heterozygous GATA6 variant (p.Cys468Tyr) that segregated in a family with CHD including persistent truncus arteriosus, atrial septal defects and bicuspid aortic valve. Additionally, affected members displayed extracardiac findings including childhood-onset diabetes mellitus, and uniquely, necrotizing enterocolitis with intestinal perforation in the first four days of life. In vitro functional assays demonstrated that the GATA6 p.Cys468Tyr variant leads to cellular localization defects and decreased transactivation activity. This work supports the importance of GATA6 as a causative gene for CHD and expands the phenotypic spectrum of pathogenic GATA6 variation, highlighting neonatal intestinal perforation as a novel extracardiac phenotype.


Subject(s)
Diabetes Mellitus , Enterocolitis, Necrotizing , Fetal Diseases , Heart Defects, Congenital , Intestinal Perforation , Truncus Arteriosus, Persistent , Female , Infant, Newborn , Humans , Child , Heart Defects, Congenital/genetics , GATA6 Transcription Factor/genetics
3.
Pac Symp Biocomput ; 29: 306-321, 2024.
Article in English | MEDLINE | ID: mdl-38160288

ABSTRACT

Recently, drug repurposing has emerged as an effective and resource-efficient paradigm for AD drug discovery. Among various methods for drug repurposing, network-based methods have shown promising results as they are capable of leveraging complex networks that integrate multiple interaction types, such as protein-protein interactions, to more effectively identify candidate drugs. However, existing approaches typically assume paths of the same length in the network have equal importance in identifying the therapeutic effect of drugs. Other domains have found that same length paths do not necessarily have the same importance. Thus, relying on this assumption may be deleterious to drug repurposing attempts. In this work, we propose MPI (Modeling Path Importance), a novel network-based method for AD drug repurposing. MPI is unique in that it prioritizes important paths via learned node embeddings, which can effectively capture a network's rich structural information. Thus, leveraging learned embeddings allows MPI to effectively differentiate the importance among paths. We evaluate MPI against a commonly used baseline method that identifies anti-AD drug candidates primarily based on the shortest paths between drugs and AD in the network. We observe that among the top-50 ranked drugs, MPI prioritizes 20.0% more drugs with anti-AD evidence compared to the baseline. Finally, Cox proportional-hazard models produced from insurance claims data aid us in identifying the use of etodolac, nicotine, and BBB-crossing ACE-INHs as having a reduced risk of AD, suggesting such drugs may be viable candidates for repurposing and should be explored further in future studies.


Subject(s)
Alzheimer Disease , Humans , Alzheimer Disease/drug therapy , Drug Repositioning/methods , Computational Biology/methods
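
The point that paths of equal length need not be equally important can be made concrete with a small sketch. The code below enumerates simple paths between a drug node and a disease node, then ranks them by the mean cosine similarity of consecutive node embeddings. That scoring rule, the toy graph, the node names, and the embedding values are all assumptions chosen for illustration; the abstract does not specify how MPI scores paths, so this should not be read as the MPI algorithm.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def simple_paths(graph, source, target, max_edges):
    """Yield every simple path from source to target with at most max_edges edges."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) - 1 >= max_edges:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))


def path_score(path, emb):
    """Hypothetical importance score: mean cosine similarity along the path."""
    sims = [cosine(emb[a], emb[b]) for a, b in zip(path, path[1:])]
    return sum(sims) / len(sims)


# Toy undirected network: one drug reaches the disease through two proteins.
graph = {
    "drug_X": ["protein_A", "protein_B"],
    "protein_A": ["drug_X", "AD"],
    "protein_B": ["drug_X", "AD"],
    "AD": ["protein_A", "protein_B"],
}
# Made-up 2-D node embeddings.
emb = {
    "drug_X": [1.0, 0.0],
    "protein_A": [0.9, 0.1],
    "protein_B": [0.0, 1.0],
    "AD": [1.0, 0.0],
}

paths = list(simple_paths(graph, "drug_X", "AD", max_edges=2))
ranked = sorted(paths, key=lambda p: path_score(p, emb), reverse=True)
```

Both paths have two edges, so a purely shortest-path criterion cannot separate them, while the embedding-based score ranks the path through `protein_A` first.
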
4.
ArXiv ; 2023 Oct 27.
Article in English | MEDLINE | ID: mdl-37961739

ABSTRACT

Recently, drug repurposing has emerged as an effective and resource-efficient paradigm for AD drug discovery. Among various methods for drug repurposing, network-based methods have shown promising results as they are capable of leveraging complex networks that integrate multiple interaction types, such as protein-protein interactions, to more effectively identify candidate drugs. However, existing approaches typically assume paths of the same length in the network have equal importance in identifying the therapeutic effect of drugs. Other domains have found that same length paths do not necessarily have the same importance. Thus, relying on this assumption may be deleterious to drug repurposing attempts. In this work, we propose MPI (Modeling Path Importance), a novel network-based method for AD drug repurposing. MPI is unique in that it prioritizes important paths via learned node embeddings, which can effectively capture a network's rich structural information. Thus, leveraging learned embeddings allows MPI to effectively differentiate the importance among paths. We evaluate MPI against a commonly used baseline method that identifies anti-AD drug candidates primarily based on the shortest paths between drugs and AD in the network. We observe that among the top-50 ranked drugs, MPI prioritizes 20.0% more drugs with anti-AD evidence compared to the baseline. Finally, Cox proportional-hazard models produced from insurance claims data aid us in identifying the use of etodolac, nicotine, and BBB-crossing ACE-INHs as having a reduced risk of AD, suggesting such drugs may be viable candidates for repurposing and should be explored further in future studies.

5.
Cell Rep Methods ; 2(9): 100293, 2022 09 19.
Article in English | MEDLINE | ID: mdl-36160050

ABSTRACT

In this work, we propose a new deep-learning model, MHCrank, to predict the probability that a peptide will be processed for presentation by MHC class I molecules. We find that the performance of our model is significantly higher than that of two previously published baseline methods: MHCflurry and netMHCpan. This improvement arises from utilizing both cleavage site-specific kernels and learned embeddings for amino acids. By visualizing site-specific amino acid enrichment patterns, we observe that MHCrank's top-ranked peptides exhibit enrichments at biologically relevant positions and are consistent with previous work. Furthermore, the cosine similarity matrix derived from MHCrank's learned embeddings for amino acids correlates highly with physicochemical properties that have been experimentally demonstrated to be instrumental in determining a peptide's favorability for processing. Altogether, the results reported in this work indicate that MHCrank demonstrates strong performance compared with existing methods and could have vast applicability in aiding drug and vaccine development.


Subject(s)
Histocompatibility Antigens Class I , Peptides , Histocompatibility Antigens Class I/chemistry , Peptides/chemistry , Amino Acids
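
A cosine similarity matrix over amino acid embeddings, as mentioned in the abstract, can be computed as in the sketch below. The three 2-D vectors are invented placeholders rather than MHCrank's learned embeddings, and the helper names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cosine_similarity_matrix(embeddings, order):
    """Square matrix whose (i, j) entry is the cosine similarity of order[i] and order[j]."""
    return [[cosine(embeddings[a], embeddings[b]) for b in order] for a in order]


# Placeholder embeddings: leucine and isoleucine (both hydrophobic) are
# given similar vectors, aspartate (acidic) a dissimilar one.
aa_embeddings = {"L": [1.0, 0.2], "I": [0.9, 0.3], "D": [-0.8, 0.5]}
order = ["L", "I", "D"]
sim = cosine_similarity_matrix(aa_embeddings, order)
```

With real embeddings, rows of such a matrix could then be compared against physicochemical property scales to check whether similar residues end up close together.
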
6.
PLoS Genet ; 18(6): e1010236, 2022 06.
Article in English | MEDLINE | ID: mdl-35737725

ABSTRACT

Congenital heart disease (CHD) is a common group of birth defects with a strong genetic contribution to their etiology, but historically the diagnostic yield from exome studies of isolated CHD has been low. Pleiotropy, variable expressivity, and the difficulty of accurately phenotyping newborns contribute to this problem. We hypothesized that performing exome sequencing on selected individuals in families with multiple members affected by left-sided CHD, then filtering variants by population frequency, in silico predictive algorithms, and phenotypic annotations from publicly available databases would increase this yield and generate a list of candidate disease-causing variants that would show a high validation rate. In eight of the nineteen families in our study (42%), we established a well-known gene/phenotype link for a candidate variant or performed confirmation of a candidate variant's effect on protein function, including variants in genes not previously described or firmly established as disease genes in the body of CHD literature: BMP10, CASZ1, ROCK1 and SMYD1. Two plausible variants in different genes were found to segregate in the same family in two instances suggesting oligogenic inheritance. These results highlight the need for functional validation and demonstrate that in the era of next-generation sequencing, multiplex families with isolated CHD can still bring high yield to the discovery of novel disease genes.


Subject(s)
Exome , Heart Defects, Congenital , Bone Morphogenetic Proteins/genetics , DNA-Binding Proteins/genetics , Exome/genetics , Gene Frequency , Genetic Association Studies , Heart Defects, Congenital/genetics , Humans , Infant, Newborn , Pedigree , Transcription Factors/genetics , Exome Sequencing , rho-Associated Kinases/genetics
7.
Heart Rhythm ; 19(4): 676-685, 2022 04.
Article in English | MEDLINE | ID: mdl-34958940

ABSTRACT

BACKGROUND: Variation in lamin A/C results in a spectrum of clinical disease, including arrhythmias and cardiomyopathy. Benign variation is rare, and classification of LMNA missense variants via in silico prediction tools results in a high rate of variants of uncertain significance (VUSs). OBJECTIVE: The goal of this study was to use a machine learning (ML) approach for in silico prediction of LMNA pathogenic variation. METHODS: Genetic sequencing was performed on family members with conduction system disease, and patient cell lines were examined for LMNA expression. In silico predictions of conservation and pathogenicity of published LMNA variants were visualized with uniform manifold approximation and projection. K-means clustering was used to identify variant groups with similarly projected scores, allowing the generation of statistically supported risk categories. RESULTS: We discovered a novel LMNA variant (c.408C>A:p.Asp136Glu) segregating with conduction system disease in a multigeneration pedigree, which was reported as a VUS by a commercial testing company. Additional familial analysis and in vitro testing found it to be pathogenic, which prompted the development of an ML algorithm that used in silico predictions of pathogenicity for known LMNA missense variants. This identified 3 clusters of variation, each with a significantly different incidence of known pathogenic variants (38.8%, 15.0%, and 6.1%). Three hundred thirty-nine of 415 head/rod domain variants (81.7%), including p.Asp136Glu, were in clusters with highest proportions of pathogenic variants. CONCLUSION: An unsupervised ML method successfully identified clusters enriched for pathogenic LMNA variants including a novel variant associated with conduction system disease. Our ML method may assist in identifying high-risk VUS when familial testing is unavailable.


Subject(s)
Heart Diseases , Lamin Type A , Machine Learning , Cardiac Conduction System Disease/genetics , Heart Diseases/genetics , Humans , Lamin Type A/genetics , Pedigree
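
The clustering step described above (K-means on projected in silico scores, followed by the share of known pathogenic variants per cluster) can be sketched with a small Lloyd's-algorithm implementation. The 2-D coordinates, pathogenicity flags, and initial centroids below are made-up toy values, and the function names are hypothetical; the study's actual features and settings are not given in the abstract.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign_clusters(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def kmeans(points, init_centroids, max_iter=50):
    """Lloyd's algorithm with caller-supplied initial centroids (deterministic)."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign_clusters(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else old)
        if updated == centroids:
            break
        centroids = updated
    return assign_clusters(points, centroids), centroids


def pathogenic_rate(labels, is_pathogenic, k):
    """Fraction of known pathogenic variants among members of cluster k."""
    flags = [f for lab, f in zip(labels, is_pathogenic) if lab == k]
    return sum(flags) / len(flags) if flags else 0.0


# Toy projected scores for seven variants and their known status (1 = pathogenic).
points = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
          [5.0, 5.1], [5.2, 4.9],
          [9.9, 0.1], [10.1, 0.0]]
is_pathogenic = [1, 1, 0, 0, 0, 0, 1]

labels, centroids = kmeans(points, init_centroids=[points[0], points[3], points[5]])
rates = [pathogenic_rate(labels, is_pathogenic, k) for k in range(3)]
```

A variant of uncertain significance that falls in a cluster with a high pathogenic rate would, under this kind of scheme, be flagged as higher risk.
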
8.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33300547

ABSTRACT

The rapid development of single-cell RNA sequencing (scRNA-Seq) technology provides strong technical support for the accurate and efficient analysis of single-cell gene expression data. However, the analysis of scRNA-Seq data is accompanied by many obstacles, including dropout events and the curse of dimensionality. Here, we propose scGMAI, a new single-cell Gaussian mixture clustering method based on autoencoder networks and fast independent component analysis (FastICA). Specifically, scGMAI utilizes autoencoder networks to reconstruct gene expression values from scRNA-Seq data, and FastICA is used to reduce the dimensions of the reconstructed data. The integration of these computational techniques allows scGMAI to outperform existing tools, including Seurat, in clustering cells from 17 public scRNA-Seq datasets. In summary, scGMAI is an effective tool for accurately clustering and identifying cell types from scRNA-Seq data and shows great potential for scRNA-Seq data analysis. The source code is available at https://github.com/QUST-AIBBDRC/scGMAI/.


Subject(s)
Algorithms , RNA-Seq , Single-Cell Analysis , Software
9.
Trends Pharmacol Sci ; 41(12): 1050-1065, 2020 12.
Article in English | MEDLINE | ID: mdl-33153777

ABSTRACT

Rapidly developing single-cell sequencing analyses produce more comprehensive profiles of the genomic, transcriptomic, and epigenomic heterogeneity of tumor subpopulations than do traditional bulk sequencing analyses. Moreover, single-cell techniques allow the response of a tumor to drug exposure to be more thoroughly investigated. Deep learning (DL) models have successfully extracted features from complex bulk sequence data to predict drug responses. We review recent innovations in single-cell technologies and DL-based approaches related to drug sensitivity predictions. We believe that, by using insights from bulk sequence data, deep transfer learning (DTL) can facilitate the use of single-cell data for training superior DL-based drug prediction models.


Subject(s)
Computational Biology , Deep Learning , Pharmaceutical Preparations , Epigenomics , Genomics , Pharmacology
10.
Comput Biol Med ; 123: 103899, 2020 08.
Article in English | MEDLINE | ID: mdl-32768046

ABSTRACT

Protein-protein interactions (PPIs) are involved with most cellular activities at the proteomic level, making the study of PPIs necessary for comprehending any biological process. Machine learning approaches have been explored, leading to more accurate and generalized PPI predictions. In this paper, we propose a predictive framework called StackPPI. First, we use pseudo amino acid composition, Moreau-Broto, Moran and Geary autocorrelation descriptors, amino acid composition position-specific scoring matrix, Bi-gram position-specific scoring matrix, and composition, transition and distribution to encode biologically relevant features. Second, we employ XGBoost to reduce feature noise and perform dimensionality reduction through gradient boosting and average gain. Finally, the resulting optimized features are analyzed by StackPPI, a PPI predictor we have developed from a stacked ensemble classifier consisting of random forest, extremely randomized trees and logistic regression algorithms. Five-fold cross-validation shows StackPPI can successfully predict PPIs with an ACC of 89.27%, MCC of 0.7859, and AUC of 0.9561 on Helicobacter pylori, and with an ACC of 94.64%, MCC of 0.8934, and AUC of 0.9810 on Saccharomyces cerevisiae. We find StackPPI improves protein interaction prediction accuracy on independent test sets compared to the state-of-the-art models. Finally, we highlight StackPPI's ability to infer biologically significant PPI networks. StackPPI's accurate prediction of functional pathways makes it the logical choice for studying the underlying mechanism of PPIs, especially as it applies to drug design. The datasets and source code used to create StackPPI are available here: https://github.com/QUST-AIBBDRC/StackPPI/.


Subject(s)
Proteomics , Saccharomyces cerevisiae , Algorithms , Machine Learning , Software
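
For readers unfamiliar with the ACC and MCC figures quoted in the abstract, the sketch below computes accuracy and the Matthews correlation coefficient from binary labels using the standard confusion-matrix definitions. The label vectors are toy values invented for illustration and are unrelated to the StackPPI datasets.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels: 1 = interacting protein pair, 0 = non-interacting pair.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
mcc = matthews_corrcoef(*counts)
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which is why it is often reported alongside ACC when classes may be imbalanced.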