Search | BVS Regional Portal (test)

Bioinformatic Analysis of Topoisomerase IIα Reveals Interdomain Interdependencies and Critical C-Terminal Domain Residues.

Endsley, Clark E; Moore, Kori A; Townsley, Thomas D; Durston, Kirk K; Deweese, Joseph E.

Int J Mol Sci ; 25(11), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38891861

ABSTRACT

DNA Topoisomerase IIα (Top2A) is a nuclear enzyme that is a cancer drug target, and there is interest in identifying novel sites on the enzyme to inhibit cancer cells more selectively and to reduce off-target toxicity. The C-terminal domain (CTD) is one potential target, but it is an intrinsically disordered domain, which prevents structural analysis. Therefore, we set out to analyze the sequence of Top2A from 105 species using bioinformatic analysis, including the PSICalc algorithm, Shannon entropy analysis, and other approaches. Our results demonstrate that large (10th-order) interdependent clusters are found including non-proximal positions across the major domains of Top2A. Further, CTD-specific clusters of the third, fourth, and fifth order, including positions that had been previously analyzed via mutation and biochemical assays, were identified. Some of these clusters coincided with positions that, when mutated, either increased or decreased relaxation activity. Finally, sites of low Shannon entropy (i.e., low variation in amino acids at a given site) were identified and mapped as key positions in the CTD. Included in the low-entropy sites are phosphorylation sites and charged positions. Together, these results help to build a clearer picture of the critical positions in the CTD and provide potential sites/regions for further analysis.

Subjects

Computational Biology; DNA Topoisomerases, Type II; Protein Domains; DNA Topoisomerases, Type II/metabolism; DNA Topoisomerases, Type II/genetics; DNA Topoisomerases, Type II/chemistry; Computational Biology/methods; Humans; Entropy; Amino Acid Sequence; Poly-ADP-Ribose Binding Proteins/metabolism; Poly-ADP-Ribose Binding Proteins/genetics; Poly-ADP-Ribose Binding Proteins/chemistry; Phosphorylation
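
The Shannon entropy analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the alignment is given as equal-length strings, that gaps are written as "-" and ignored, and that entropy is measured in bits; the function names and the toy sequences are hypothetical and are not taken from the paper or from Top2A data.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring '-' gaps."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def msa_entropies(sequences):
    """Entropy of every column of an MSA given as equal-length strings."""
    return [column_entropy(col) for col in zip(*sequences)]


# Toy alignment: made-up sequences used only for illustration.
toy_msa = ["SKDE", "SKEE", "SRDE", "SKDQ"]
entropies = msa_entropies(toy_msa)

# A fully conserved column has zero entropy (no variation at that site).
low_entropy_sites = [i for i, h in enumerate(entropies) if h == 0.0]
```

In this toy alignment only the first column is fully conserved, so it is the only zero-entropy site; a real analysis would choose a low-entropy threshold suited to the data rather than exact zero.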

PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure.

Townsley, Thomas D; Wilson, James T; Akers, Harrison; Bryant, Timothy; Cordova, Salvador; Wallace, T L; Durston, Kirk K; Deweese, Joseph E.

Bioinform Adv ; 2(1): vbac058, 2022.

Article in English | MEDLINE | ID: mdl-36699404

ABSTRACT

Motivation: AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular components of a protein are essential for it to carry out its function within the cell. Direct coupling analysis predicts two- and three-amino acid contacts, but there may be essential interdependencies that are not proximal within the 3D structure. The problem to be addressed is to design a computational method that locates and ranks essential non-proximal interdependencies within a protein involving five or more amino acids, using large, multiple sequence alignments (MSAs) for both globular and intrinsically unstructured proteins. Results: We developed PSICalc (Protein Subdomain Interdependency Calculator), a laptop-friendly, pattern-discovery, bioinformatics software tool that analyzes large MSAs for both structured and unstructured proteins, locates both proximal and non-proximal inter-dependent sites, and clusters them into pairwise (second order), third-order and higher-order clusters using a k-modes approach, and provides ranked results within minutes. To aid in visualizing these interdependencies, we developed a graphical user interface that displays these subdomain relationships as a polytree graph. To demonstrate, we provide examples of both proximal and non-proximal interdependencies documented for eukaryotic topoisomerase II including between the unstructured C-terminal domain and the N-terminal domain. Availability and implementation: https://github.com/jdeweeselab/psicalc-package. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.

Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl.

EURASIP J Bioinform Syst Biol ; 2012(1): 8, 2012 Jul 13.

Article in English | MEDLINE | ID: mdl-22793672

ABSTRACT

BACKGROUND: Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there have been indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. RESULTS: The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function.
CONCLUSIONS: Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.
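
The normalized mutual information measure named in this abstract can be sketched for a single pair of aligned sites. The abstract does not give the normalization, so the sketch assumes mutual information divided by the joint entropy, which is one common choice and may differ from the authors' exact measure. The function names and toy columns are hypothetical, and the clustering step itself is not shown.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of a sequence of hashable symbols."""
    total = len(symbols)
    counts = Counter(symbols)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def normalized_mi(col_x, col_y):
    """Mutual information of two columns divided by their joint entropy.

    Assumption: normalizing by H(X, Y) is an illustrative choice, not
    necessarily the normalization used in the paper.
    """
    pairs = list(zip(col_x, col_y))
    h_x, h_y, h_xy = entropy(col_x), entropy(col_y), entropy(pairs)
    if h_xy == 0.0:
        return 0.0  # both columns fully conserved: no covariation to measure
    return (h_x + h_y - h_xy) / h_xy


# Toy columns: each string is one aligned site read down the alignment.
site_a = "AAGG"
site_b = "KKRR"  # changes exactly when site_a changes
site_c = "KRKR"  # varies independently of site_a

covarying = normalized_mi(site_a, site_b)
independent = normalized_mi(site_a, site_c)
```

Here the perfectly covarying pair scores 1.0 and the independent pair scores 0.0. A site-clustering method would compute such scores for many site pairs and group the sites whose mutual scores are high.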

Measuring the functional sequence complexity of proteins.

Durston, Kirk K; Chiu, David K Y; Abel, David L; Trevors, Jack T.

Theor Biol Med Model ; 4: 47, 2007 Dec 06.

Article in English | MEDLINE | ID: mdl-18062814

ABSTRACT

BACKGROUND: Abel and Trevors have delineated three aspects of sequence complexity, Random Sequence Complexity (RSC), Ordered Sequence Complexity (OSC) and Functional Sequence Complexity (FSC) observed in biosequences such as proteins. In this paper, we provide a method to measure functional sequence complexity. METHODS AND RESULTS: We have extended Shannon uncertainty by incorporating the data variable with a functionality variable. The resulting measured unit, which we call Functional bit (Fit), is calculated from the sequence data jointly with the defined functionality variable. To demonstrate the relevance to functional bioinformatics, a method to measure functional sequence complexity was developed and applied to 35 protein families. Considerations were made in determining how the measure can be used to correlate functionality when relating to the whole molecule and sub-molecule. In the experiment, we show that when the proposed measure is applied to the aligned protein sequences of ubiquitin, 6 of the 7 highest value sites correlate with the binding domain. CONCLUSION: For future extensions, measures of functional bioinformatics may provide a means to evaluate potential evolving pathways from effects such as mutations, as well as analyzing the internal structural and functional relationships within the 3-D structure of proteins.

Subjects

Computational Biology/methods; Proteins/physiology; Sequence Analysis, Protein; Animals; Humans; Multigene Family; Protein Structure, Tertiary/genetics; Proteins/genetics; Software; Uncertainty
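
The Functional bit (Fit) measure described in the abstract above can be sketched as the drop in Shannon uncertainty between a null state and the functional state, summed over aligned sites. The abstract does not define the null state, so the sketch assumes 20 equiprobable amino acids (log2 20 bits per site) and treats the observed column frequencies as the functional state. The function names and toy sequences are hypothetical, and gaps written as "-" are ignored.

```python
import math
from collections import Counter

# Assumption: the null state is 20 equiprobable amino acids per site.
NULL_STATE_BITS = math.log2(20)


def site_fits(column):
    """Fits at one aligned site: null-state bits minus observed entropy."""
    residues = [aa for aa in column if aa != "-"]
    total = len(residues)
    if total == 0:
        return 0.0  # an all-gap column contributes nothing
    counts = Counter(residues)
    h_functional = sum(-(n / total) * math.log2(n / total) for n in counts.values())
    return NULL_STATE_BITS - h_functional


def total_fits(sequences):
    """Sum of per-site Fits over an MSA given as equal-length strings."""
    return sum(site_fits(col) for col in zip(*sequences))


# Toy alignment: made-up sequences used only for illustration.
toy_family = ["MQ", "MK"]
per_site = [site_fits(col) for col in zip(*toy_family)]
family_fits = total_fits(toy_family)
```

In the toy family the conserved first site carries the full log2 20 bits and the second site, split evenly between two residues, carries one bit less. Highly conserved sites therefore score highest under this sketch.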
