1.
Big Data ; 8(5): 363-378, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33090027

ABSTRACT

Many real-world graphs are temporal; for example, in a social network, persons interact only at specific points in time. This temporal information directs dissemination processes on the graph, such as the spread of rumors, fake news, or diseases. However, the current state-of-the-art methods for supervised graph classification are mainly designed for static graphs and may not capture temporal information. Hence, they are not powerful enough to distinguish between graphs modeling different dissemination processes. To address this, we introduce a framework that lifts standard graph kernels and graph-based neural networks to the temporal domain. We explore three different approaches and investigate the trade-offs between loss of temporal information and efficiency. Moreover, to handle large-scale graphs, we propose stochastic variants of our kernels with provable approximation guarantees. We evaluate our methods, both kernel and neural architectures, on various real-world social networks to validate our theoretical findings. Our methods beat static approaches by a large margin in terms of accuracy while still being scalable to large graphs and data sets. Moreover, we show that our framework reaches high classification accuracy in scenarios where most of the dissemination process information is incomplete.


Subject(s)
Algorithms , Computer Graphics , Data Display
2.
Bioinformatics ; 36(8): 2417-2428, 2020 Apr 15.
Article in English | MEDLINE | ID: mdl-31742326

ABSTRACT

MOTIVATION: Secondary structure classification is one of the most important issues in structure-based analyses due to its impact on secondary structure prediction, structural alignment and protein visualization. There are still open challenges concerning helix and sheet assignments which are currently not addressed by a single multi-purpose software. RESULTS: We introduce SCOT (Secondary structure Classification On Turns) as a novel secondary structure element assignment software which supports the assignment of turns, right-handed α-, 3₁₀- and π-helices, left-handed α- and 3₁₀-helices, 2.2₇- and polyproline II helices, β-sheets and kinks. We demonstrate that the introduction of helix Purity values enables a clear differentiation between helix classes. SCOT's unique strengths are highlighted by comparing it to six state-of-the-art methods (DSSP, STRIDE, ASSP, SEGNO, DISICL and SHAFT). The assignment approaches were compared concerning geometric consistency, protein structure quality and flexibility dependency and their impact on secondary structure element-based structural alignments. We show that only SCOT's combination of hydrogen bonds, geometric criteria and dihedral angles enables robust assignments independent of the structure quality and flexibility. We demonstrate that this combination and the elaborate kink detection lead to SCOT's clear superiority for protein alignments. As the resulting helices and strands are provided in a PDB-conformant output format, they can immediately be used for structure alignment algorithms. Taken together, the application of our new method and the straightforward visualization using the accompanying PyMOL scripts enable the comprehensive analysis of regular backbone geometries in proteins. AVAILABILITY AND IMPLEMENTATION: https://this-group.rocks. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteins , Software , Algorithms , Hydrogen Bonding , Protein Structure, Secondary
3.
ChemMedChem ; 13(6): 532-539, 2018 Mar 20.
Article in English | MEDLINE | ID: mdl-29392860

ABSTRACT

A common issue during drug design and development is the discovery of novel scaffolds for protein targets. On the one hand, the chemical space of purchasable compounds is rather limited; on the other hand, artificially generated molecules suffer from a grave lack of accessibility in practice. Therefore, we generated a novel virtual library of small molecules that are synthesizable from purchasable educts, called CHIPMUNK (CHemically feasible In silico Public Molecular UNiverse Knowledge base). Altogether, CHIPMUNK covers over 95 million compounds and encompasses regions of the chemical space that are not covered by existing databases. The coverage of CHIPMUNK exceeds the chemical space spanned by the Lipinski rule of five to foster the exploration of novel and difficult target classes. The analysis of the generated property space reveals that CHIPMUNK is well suited for the design of protein-protein interaction inhibitors (PPIIs). Furthermore, a recently developed structural clustering algorithm (StruClus) for big data was used to partition the sub-libraries into meaningful subsets and to assist scientists in processing the large amount of data. These clustered subsets also contain the target space based on ChEMBL data, which was included during clustering.


Subject(s)
Proteins/chemistry , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Algorithms , Chemistry, Pharmaceutical , Cluster Analysis , Drug Design , Protein Binding/drug effects , Proteins/antagonists & inhibitors , Small Molecule Libraries/chemical synthesis
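
The abstract of record 3 refers to the chemical space spanned by the Lipinski rule of five. As a minimal sketch of what that filter checks, the following counts violations of the four commonly cited thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The class and function names and both example descriptor sets are hypothetical and chosen for illustration only; they are not taken from CHIPMUNK, and the descriptor values would normally come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    molecular_weight: float  # in Da
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        d.molecular_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def is_rule_of_five_compliant(d: Descriptors, max_violations: int = 1) -> bool:
    """The rule is commonly applied as 'no more than one violation'."""
    return rule_of_five_violations(d) <= max_violations


# Illustrative, made-up descriptor values (not real compounds).
small_molecule = Descriptors(180.2, 1.2, 1, 4)
large_molecule = Descriptors(812.0, 6.3, 4, 12)

print(rule_of_five_violations(small_molecule), is_rule_of_five_compliant(small_molecule))
print(rule_of_five_violations(large_molecule), is_rule_of_five_compliant(large_molecule))
```

A library whose coverage "exceeds" this space would contain many compounds like the second example, which a strict rule-of-five filter would discard.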
4.
J Cheminform ; 9(1): 28, 2017 May 11.
Article in English | MEDLINE | ID: mdl-29086162

ABSTRACT

The era of big data is influencing how rational drug discovery and the development of bioactive molecules are performed, and versatile tools are needed to assist in molecular design workflows. Scaffold Hunter is a flexible visual analytics framework for the analysis of chemical compound data and combines techniques from several fields such as data mining and information visualization. The framework allows analyzing high-dimensional chemical compound data in an interactive fashion, combining intuitive visualizations with automated analysis methods including versatile clustering methods. Originally designed to analyze the scaffold tree, Scaffold Hunter is continuously revised and extended. We describe recent extensions that significantly increase the applicability for a variety of tasks.

5.
Mol Inform ; 32(11-12): 964-75, 2013 Dec.
Article in English | MEDLINE | ID: mdl-27481142

ABSTRACT

The growing interest in chemogenomics approaches over the last years has led to an increasing amount of data regarding chemical and the corresponding biological activity space. The resulting data, collected in either in-house or public databases, need to be analyzed efficiently to speed up the increasingly difficult task of drug discovery. Unfortunately, the discovery of new chemical entities or new targets for known drugs ('drug repurposing') is not amenable to a fully automated analysis or a simple drill-down process. Visual interactive interfaces that allow the systematic exploration of chemical space and facilitate analytical reasoning can help to overcome these problems. Scaffold Hunter is a tool for the visual analysis of chemical compound databases that provides integrated visualization and analysis of biological activity data and fosters the interactive exploration of data imported from a variety of sources. We describe the features and illustrate the use by means of an exemplary analysis workflow.

6.
Nat Chem Biol ; 5(8): 581-3, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19561620

ABSTRACT

We describe Scaffold Hunter, a highly interactive computer-based tool for navigation in chemical space that fosters intuitive recognition of complex structural relationships associated with bioactivity. The program reads compound structures and bioactivity data, generates compound scaffolds, correlates them in a hierarchical tree-like arrangement, and annotates them with bioactivity. Brachiation along tree branches from structurally complex to simple scaffolds allows identification of new ligand types. We provide proof of concept for pyruvate kinase.


Subject(s)
Chemistry, Pharmaceutical/methods , Computer Simulation , Databases, Factual , Models, Molecular , Small Molecule Libraries/chemistry , Software
7.
Bioinformatics ; 25(6): 758-64, 2009 Mar 15.
Article in English | MEDLINE | ID: mdl-19176558

ABSTRACT

MOTIVATION: Proteomics has become of high interest for the field of biomarker discovery and drug development. In particular, the combination of liquid chromatography and mass spectrometry (LC/MS) has proven to be a powerful technique for analyzing protein mixtures. Clinically oriented proteomic studies will have to compare hundreds of LC/MS runs at a time. In order to compare different runs, sophisticated preprocessing steps have to be performed. An important step is the retention time (rt) alignment of LC/MS runs. Non-linear shifts in the rt between pairs of LC/MS runs in particular make this a crucial and non-trivial problem. RESULTS: To demonstrate the importance of correcting non-linear rt shifts, we evaluate and compare different alignment algorithms. We present and analyze two versions of a new algorithm based on regression techniques: one assuming and estimating only linear shifts, and one also allowing for the estimation of non-linear shifts. As an example of another type of alignment method, we use an established alignment algorithm based on shifting vectors, which we adapted to also correct non-linear shifts. In a simulation study, we show that rt alignment procedures that can estimate non-linear shifts yield clearly better alignments. This holds even under mild non-linear deviations. AVAILABILITY: R code for the regression-based alignment methods and simulated datasets are available at http://www.statistik.tu-dortmund.de/genetik-publikationen-alignment.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Chromatography, Liquid/methods , Mass Spectrometry/methods , Proteomics/methods , Computer Simulation , Proteins/chemistry , Proteome/chemistry
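
The contrast drawn in record 7 between estimating only linear retention-time shifts and also estimating non-linear ones can be illustrated with a small sketch. The abstract does not specify the regression model, so the polynomial least-squares fit below is an assumption made for illustration and is not the authors' R code. The landmark retention times are synthetic, and the helper names (`polyfit`, `predict`, `rms_residual`) are hypothetical.

```python
from typing import List, Sequence


def polyfit(x: Sequence[float], y: Sequence[float], degree: int) -> List[float]:
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    n = degree + 1
    # Normal equations A c = b: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def predict(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def rms_residual(coeffs: Sequence[float], x: Sequence[float], y: Sequence[float]) -> float:
    """Root-mean-square error of the fitted mapping on the landmark pairs."""
    sq = sum((predict(coeffs, xi) - yi) ** 2 for xi, yi in zip(x, y))
    return (sq / len(x)) ** 0.5


# Synthetic landmark peaks (minutes): rt in run A, and the same peaks in run B
# with a constant offset plus a mild quadratic drift.
rt_a = [float(t) for t in range(11)]
rt_b = [t + 0.2 + 0.02 * t * t for t in rt_a]

linear = polyfit(rt_a, rt_b, 1)      # estimates linear shifts only
quadratic = polyfit(rt_a, rt_b, 2)   # also allows a non-linear shift

err_linear = rms_residual(linear, rt_a, rt_b)
err_quadratic = rms_residual(quadratic, rt_a, rt_b)
print(f"linear fit RMS error:    {err_linear:.4f} min")
print(f"quadratic fit RMS error: {err_quadratic:.4f} min")
```

On this synthetic data the linear mapping leaves a systematic residual that the quadratic mapping removes, which mirrors the abstract's claim in miniature; real LC/MS data would also contain noise and mismatched peaks that this sketch ignores.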
8.
J Integr Bioinform ; 5(2), 2008 Aug 25.
Article in English | MEDLINE | ID: mdl-20134061

ABSTRACT

Proteins and their interactions are essential for the functioning of all organisms and for understanding biological processes. Alternative splicing is an important molecular mechanism for increasing the protein diversity in eukaryotic cells. Splicing events that alter the protein structure and the domain composition can be responsible for the regulation of protein interactions and the functional diversity of different tissues. Discovering the occurrence of splicing events and studying protein isoforms have become feasible using Affymetrix Exon Arrays. Therefore, we have developed the versatile Cytoscape plugin DomainGraph that allows for the visual analysis of protein domain interaction networks and their integration with exon expression data. Protein domains affected by alternative splicing are highlighted and splicing patterns can be compared.


Subject(s)
Alternative Splicing , Computational Biology/methods , Protein Interaction Domains and Motifs/genetics , Protein Interaction Mapping/methods , Proteins/genetics , Exons , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proteins/metabolism , RNA Splicing