Search | VHL Regional Portal

1.

Prognostic importance of splicing-triggered aberrations of protein complex interfaces in cancer.

Newaz, Khalique; Schaefers, Christoph; Weisel, Katja; Baumbach, Jan; Frishman, Dmitrij.

NAR Genom Bioinform ; 6(3): lqae133, 2024 Sep.

Article in English | MEDLINE | ID: mdl-39328266

ABSTRACT

Aberrant alternative splicing (AS) is a prominent hallmark of cancer. AS can perturb protein-protein interactions (PPIs) by adding or removing interface regions encoded by individual exons. Identifying prognostic exon-exon interactions (EEIs) from PPI interfaces can help discover AS-affected cancer-driving PPIs that can serve as potential drug targets. Here, we assessed the prognostic significance of EEIs across 15 cancer types by integrating RNA-seq data with three-dimensional (3D) structures of protein complexes. By analyzing the resulting EEI network we identified patient-specific perturbed EEIs (i.e., EEIs present in healthy samples but absent from the paired cancer samples or vice versa) that were significantly associated with survival. We provide the first evidence that EEIs can be used as prognostic biomarkers for cancer patient survival. Our findings provide mechanistic insights into AS-affected PPI interfaces. Given the ongoing expansion of available RNA-seq data and the number of 3D structurally-resolved (or confidently predicted) protein complexes, our computational framework will help accelerate the discovery of clinically important cancer-promoting AS events.
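As an illustration of the definition of perturbed EEIs given in this abstract (present in the healthy sample but absent from the paired cancer sample, or vice versa), the following is a minimal sketch, not the authors' pipeline. The function names `eei` and `perturbed_eeis` and the exon identifiers are hypothetical, and each sample is assumed to be already reduced to a set of unordered exon pairs.

```python
def eei(exon_a, exon_b):
    """Represent an exon-exon interaction as an unordered pair (hypothetical helper)."""
    return frozenset((exon_a, exon_b))


def perturbed_eeis(healthy, cancer):
    """Split the perturbed EEIs of one patient into lost and gained sets.

    healthy, cancer: sets of EEIs observed in the paired samples.
    lost   = present in healthy, absent from cancer
    gained = present in cancer, absent from healthy
    """
    lost = healthy - cancer
    gained = cancer - healthy
    return lost, gained


# Toy paired samples with made-up exon identifiers.
healthy_sample = {eei("E1", "E2"), eei("E3", "E4")}
cancer_sample = {eei("E3", "E4"), eei("E5", "E6")}
lost, gained = perturbed_eeis(healthy_sample, cancer_sample)
```

The union of `lost` and `gained` is the symmetric difference of the two sets; keeping them separate preserves the direction of the change.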

2.

Multi-layer sequential network analysis improves protein 3D structural classification.

Newaz, Khalique; Piland, Jacob; Clark, Patricia L; Emrich, Scott J; Li, Jun; Milenkovic, Tijana.

Proteins ; 90(9): 1721-1731, 2022 09.

Article in English | MEDLINE | ID: mdl-35441395

ABSTRACT

Protein structural classification (PSC) is a supervised problem of assigning proteins into pre-defined structural (e.g., CATH or SCOPe) classes based on the proteins' sequence or 3D structural features. We recently proposed PSC approaches that model protein 3D structures as protein structure networks (PSNs) and analyze PSN-based protein features, which performed better than or comparable to state-of-the-art sequence or other 3D structure-based PSC approaches. However, existing PSN-based PSC approaches model the whole 3D structure of a protein as a static (i.e., single-layer) PSN. Because folding of a protein is a dynamic process, where some parts (i.e., sub-structures) of a protein fold before others, modeling the 3D structure of a protein as a PSN that captures the sub-structures might further help improve the existing PSC performance. Here, we propose to model 3D structures of proteins as multi-layer sequential PSNs that approximate 3D sub-structures of proteins, with the hypothesis that this will improve upon the current state-of-the-art PSC approaches that are based on single-layer PSNs (and thus upon the existing state-of-the-art sequence and other 3D structural approaches). Indeed, we confirm this on 72 datasets spanning ~44 000 CATH and SCOPe protein domains.

Subject(s)

Proteins , Amino Acid Sequence , Proteins/chemistry , Sequence Alignment

3.

Inference of a Dynamic Aging-related Biological Subnetwork via Network Propagation.

Newaz, Khalique; Milenkovic, Tijana.

IEEE/ACM Trans Comput Biol Bioinform ; 19(2): 974-988, 2022.

Article in English | MEDLINE | ID: mdl-32897864

ABSTRACT

Gene expression (GE) data capture valuable condition-specific information ("condition" can mean a biological process, disease stage, age, patient, etc.). However, GE analyses ignore physical interactions between gene products, i.e., proteins. Because proteins function by interacting with each other, and because biological networks (BNs) capture these interactions, BN analyses are promising. However, current BN data fail to capture condition-specific information. Recently, GE and BN data have been integrated using network propagation (NP) to infer condition-specific BNs. However, existing NP-based studies result in a static condition-specific subnetwork, even though cellular processes are dynamic. A dynamic process of our interest is human aging. We use prominent existing NP methods in a new task of inferring a dynamic rather than static condition-specific (aging-related) subnetwork. Then, we study evolution of network structure with age: we identify proteins whose network positions significantly change with age and predict them as new aging-related candidates. We validate the predictions via, e.g., functional enrichment analyses and literature search. Dynamic network inference via NP yields higher prediction quality than the only existing method for inferring a dynamic aging-related BN, which does not use NP. Our data and code are available at https://nd.edu/~cone/dynetinf.
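The abstract does not name the NP methods it uses. As a generic illustration only, the sketch below implements random walk with restart, one widely used form of network propagation. The function name `propagate`, the restart probability of 0.5, and the toy network are assumptions and are not taken from the paper.

```python
def propagate(adj, seeds, restart=0.5, iters=100):
    """Random walk with restart on an undirected network.

    adj:   dict mapping each node to the set of its neighbours
    seeds: dict mapping nodes to non-negative starting scores
           (e.g. derived from gene expression)
    Returns a dict of propagated scores that sum to 1.
    """
    nodes = list(adj)
    total = sum(seeds.get(v, 0.0) for v in nodes)
    if total <= 0:
        raise ValueError("at least one seed must have a positive score")
    p0 = {v: seeds.get(v, 0.0) / total for v in nodes}
    p = dict(p0)
    for _ in range(iters):
        # Each step: restart mass returns to the seeds, the rest diffuses.
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            if adj[u]:
                share = (1.0 - restart) * p[u] / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            else:
                # An isolated node keeps its non-restart mass.
                nxt[u] += (1.0 - restart) * p[u]
        p = nxt
    return p


# Toy path network a - b - c with a single seed at "a".
toy_adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
scores = propagate(toy_adj, {"a": 1.0})
```

Running the propagation once per age group, each time with age-specific seed scores, would give one score vector per age; how the scores are turned into a subnetwork is left open here.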

Subject(s)

Aging , Proteins , Aging/genetics , Humans , Proteins/genetics

4.

Towards future directions in data-integrative supervised prediction of human aging-related genes.

Li, Qi; Newaz, Khalique; Milenkovic, Tijana.

Bioinform Adv ; 2(1): vbac081, 2022.

Article in English | MEDLINE | ID: mdl-36699345

ABSTRACT

Motivation: Identification of human genes involved in the aging process is critical due to the incidence of many diseases with age. A state-of-the-art approach for this purpose infers a weighted dynamic aging-specific subnetwork by mapping gene expression (GE) levels at different ages onto the protein-protein interaction network (PPIN). Then, it analyzes this subnetwork in a supervised manner by training a predictive model to learn how network topologies of known aging- versus non-aging-related genes change across ages. Finally, it uses the trained model to predict novel aging-related gene candidates. However, the best current subnetwork resulting from this approach still yields suboptimal prediction accuracy. This could be because it was inferred using outdated GE and PPIN data. Here, we evaluate whether analyzing a weighted dynamic aging-specific subnetwork inferred from newer GE and PPIN data improves prediction accuracy upon analyzing the best current subnetwork inferred from outdated data. Results: Unexpectedly, we find that not to be the case. To understand this, we perform aging-related pathway and Gene Ontology term enrichment analyses. We find that the suboptimal prediction accuracy, regardless of which GE or PPIN data is used, may be caused by the current knowledge about which genes are aging-related being incomplete, or by the current methods for inferring or analyzing an aging-specific subnetwork being unable to capture all of the aging-related knowledge. These findings can potentially guide future directions towards improving supervised prediction of aging-related genes via -omics data integration. Availability and implementation: All data and code are available at zenodo, DOI: 10.5281/zenodo.6995045. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

5.

Improved supervised prediction of aging-related genes via weighted dynamic network analysis.

Li, Qi; Newaz, Khalique; Milenkovic, Tijana.

BMC Bioinformatics ; 22(1): 520, 2021 Oct 25.

Article in English | MEDLINE | ID: mdl-34696741

ABSTRACT

BACKGROUND: This study focuses on the task of supervised prediction of aging-related genes from -omics data. Unlike gene expression methods for this task that capture aging-specific information but ignore interactions between genes (i.e., their protein products), or protein-protein interaction (PPI) network methods for this task that account for PPIs but the PPIs are context-unspecific, we recently integrated the two data types into an aging-specific PPI subnetwork, which yielded more accurate aging-related gene predictions. However, a dynamic aging-specific subnetwork did not improve prediction performance compared to a static aging-specific subnetwork, despite the aging process being dynamic. This could be because the dynamic subnetwork was inferred using a naive Induced subgraph approach. Instead, we recently inferred a dynamic aging-specific subnetwork using a methodologically more advanced notion of network propagation (NP), which improved upon Induced dynamic aging-specific subnetwork in a different task, that of unsupervised analyses of the aging process. RESULTS: Here, we evaluate whether our existing NP-based dynamic subnetwork will improve upon the dynamic as well as static subnetwork constructed by the Induced approach in the considered task of supervised prediction of aging-related genes. The existing NP-based subnetwork is unweighted, i.e., it gives equal importance to each of the aging-specific PPIs. Because accounting for aging-specific edge weights might be important, we additionally propose a weighted NP-based dynamic aging-specific subnetwork. We demonstrate that a predictive machine learning model trained and tested on the weighted subnetwork yields higher accuracy when predicting aging-related genes than predictive models run on the existing unweighted dynamic or static subnetworks, regardless of whether the existing subnetworks were inferred using NP or the Induced approach. 
CONCLUSIONS: Our proposed weighted dynamic aging-specific subnetwork and its corresponding predictive model could guide with higher confidence than the existing data and models the discovery of novel aging-related gene candidates for future wet lab validation.

Subject(s)

Protein Interaction Maps , Proteins , Gene Expression

6.

Network-based protein structural classification.

Newaz, Khalique; Ghalehnovi, Mahboobeh; Rahnama, Arash; Antsaklis, Panos J; Milenkovic, Tijana.

R Soc Open Sci ; 7(6): 191461, 2020 Jun.

Article in English | MEDLINE | ID: mdl-32742675

ABSTRACT

Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified proteins based on their features, and then relying on the fact that proteins with similar structures have similar functions. Existing PSC approaches rely on sequence-based or direct three-dimensional (3D) structure-based protein features. By contrast, we first model 3D structures of proteins as protein structure networks (PSNs). Then, we use network-based features for PSC. We propose the use of graphlets, state-of-the-art features in many research areas of network science, in the task of PSC. Moreover, because graphlets can deal only with unweighted PSNs, and because accounting for edge weights when constructing PSNs could improve PSC accuracy, we also propose a deep learning framework that automatically learns network features from weighted PSNs. When evaluated on a large set of approximately 9400 CATH and approximately 12 800 SCOP protein domains (spanning 36 PSN sets), the best of our proposed approaches are superior to existing PSC approaches in terms of accuracy, with comparable running times. Our data and code are available at https://doi.org/10.5281/zenodo.3787922.

7.

Author Correction: GRAFENE: Graphlet-based alignment-free network approach integrates 3D structural and sequence (residue order) data to improve protein structural comparison.

Faisal, Fazle E; Newaz, Khalique; Chaney, Julie L; Li, Jun; Emrich, Scott J; Clark, Patricia L; Milenkovic, Tijana.

Sci Rep ; 10(1): 13455, 2020 Aug 10.

Article in English | MEDLINE | ID: mdl-32778675

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

8.

Network analysis of synonymous codon usage.

Newaz, Khalique; Wright, Gabriel; Piland, Jacob; Li, Jun; Clark, Patricia L; Emrich, Scott J; Milenkovic, Tijana.

Bioinformatics ; 36(19): 4876-4884, 2020 12 08.

Article in English | MEDLINE | ID: mdl-32609328

ABSTRACT

MOTIVATION: Most amino acids are encoded by multiple synonymous codons, some of which are used more rarely than others. Analyses of positions of such rare codons in protein sequences revealed that rare codons can impact co-translational protein folding and that positions of some rare codons are evolutionarily conserved. Analyses of their positions in protein 3-dimensional structures, which are richer in biochemical information than sequences alone, might further explain the role of rare codons in protein folding. RESULTS: We model protein structures as networks and use network centrality to measure the structural position of an amino acid. We first validate that amino acids buried within the structural core are network-central, and those on the surface are not. Then, we study potential differences between network centralities and thus structural positions of amino acids encoded by conserved rare, non-conserved rare and commonly used codons. We find that in 84% of proteins, the three codon categories occupy significantly different structural positions. We examine protein groups showing different codon centrality trends, i.e. different relationships between structural positions of the three codon categories. We see several cases of all proteins from our data with some structural or functional property being in the same group. Also, we see a case of all proteins in some group having the same property. Our work shows that codon usage is linked to the final protein structure and thus possibly to co-translational protein folding. AVAILABILITY AND IMPLEMENTATION: https://nd.edu/~cone/CodonUsage/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
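To make the idea of measuring an amino acid's structural position by network centrality concrete, here is a minimal sketch under stated assumptions. The abstract does not specify how the networks are built or which centrality is used; the 8 Å distance cutoff, the use of one representative coordinate per residue, and degree centrality are illustrative choices, and the function names are hypothetical.

```python
import math


def build_psn(coords, cutoff=8.0):
    """Build a protein structure network from per-residue 3D coordinates.

    coords: list of (x, y, z) tuples, one per residue (assumed input)
    cutoff: residues closer than this distance are linked (assumed value)
    Returns a dict mapping residue index to the set of linked residue indices.
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centrality(adj):
    """Fraction of the other residues each residue is linked to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


# Toy coordinates: three residues in a row plus one far away.
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
centrality = degree_centrality(build_psn(toy_coords))
```

In this toy case the middle residue of the row is the most central and the distant residue has centrality zero, mirroring the buried-versus-surface contrast described in the abstract.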

Subject(s)

Codon Usage , Protein Folding , Amino Acid Sequence , Codon/genetics , Proteins/genetics

9.

GRAFENE: Graphlet-based alignment-free network approach integrates 3D structural and sequence (residue order) data to improve protein structural comparison.

Faisal, Fazle E; Newaz, Khalique; Chaney, Julie L; Li, Jun; Emrich, Scott J; Clark, Patricia L; Milenkovic, Tijana.

Sci Rep ; 7(1): 14890, 2017 11 02.

Article in English | MEDLINE | ID: mdl-29097661

ABSTRACT

Initial protein structural comparisons were sequence-based. Since amino acids that are distant in the sequence can be close in the 3-dimensional (3D) structure, 3D contact approaches can complement sequence approaches. Traditional 3D contact approaches study 3D structures directly and are alignment-based. Instead, 3D structures can be modeled as protein structure networks (PSNs). Then, network approaches can compare proteins by comparing their PSNs. These can be alignment-based or alignment-free. We focus on the latter. Existing network alignment-free approaches have drawbacks: 1) They rely on naive measures of network topology. 2) They are not robust to PSN size. They cannot integrate 3) multiple PSN measures or 4) PSN data with sequence data, although this could improve comparison because the different data types capture complementary aspects of the protein structure. We address this by: 1) exploiting well-established graphlet measures via a new network alignment-free approach, 2) introducing normalized graphlet measures to remove the bias of PSN size, 3) allowing for integrating multiple PSN measures, and 4) using ordered graphlets to combine the complementary PSN data and sequence (specifically, residue order) data. We compare synthetic networks and real-world PSNs more accurately and faster than existing network (alignment-free and alignment-based), 3D contact, or sequence approaches.

Subject(s)

Proteins/chemistry , Software , Algorithms , Amino Acids/chemistry , Computer Graphics , Databases, Protein , Models, Biological , Protein Conformation

10.

Identification of Major Signaling Pathways in Prion Disease Progression Using Network Analysis.

Newaz, Khalique; Sriram, K; Bera, Debajyoti.

PLoS One ; 10(12): e0144389, 2015.

Article in English | MEDLINE | ID: mdl-26646948

ABSTRACT

Prion diseases are transmissible neurodegenerative diseases that arise due to conformational change of normal, cellular prion protein (PrPC) to protease-resistant isoform (rPrPSc). Deposition of misfolded PrPSc proteins leads to an alteration of many signaling pathways that include immunological and apoptotic pathways. As a result, this culminates in the dysfunction and death of neuronal cells. Earlier works on transcriptomic studies have revealed some affected pathways, but it is not clear which is (are) the prime network pathway(s) that change during the disease progression and how these pathways are involved in crosstalks with each other from the time of incubation to clinical death. We perform network analysis on large-scale transcriptomic data of differentially expressed genes obtained from whole brain in six different mouse strain-prion strain combination models to determine the pathways involved in prion diseases, and to understand the role of crosstalks in disease propagation. We employ a notion of differential network centrality measures on protein interaction networks to identify the potential biological pathways involved. We also propose a crosstalk ranking method based on dynamic protein interaction networks to identify the core network elements involved in crosstalk with different pathways. We identify 148 DEGs (differentially expressed genes) potentially related to the prion disease progression. Functional association of the identified genes implicates a strong involvement of immunological pathways. We extract a bow-tie structure that is potentially dysregulated in prion disease. We also propose an ODE model for the bow-tie network. Predictions related to the diseased condition suggest the downregulation of the core signaling elements (PI3Ks and AKTs) of the bow-tie network. In this work, we show using transcriptomic data that the neuronal dysfunction in prion disease is strongly related to the immunological pathways. We conclude that these immunological pathways occupy influential positions in the PFNs (protein functional networks) that are related to prion disease. Importantly, this functional network involvement is prevalent in all the five different mouse strain-prion strain combinations that we studied. We also conclude that the dysregulation of the core elements of the bow-tie structure, which belongs to the PI3K-Akt signaling pathway, leads to dysregulation of the downstream components corresponding to other biological pathways.

Subject(s)

Prion Diseases/pathology , Signal Transduction , Disease Progression , Humans , Prion Diseases/genetics
