Search | BVS Regional Portal

1.

Data integration for identification of important transcription factors of STAT6-mediated cell fate decisions.

Jargosch, M; Kröger, S; Gralinska, E; Klotz, U; Fang, Z; Chen, W; Leser, U; Selbig, J; Groth, D; Baumgrass, R.

Genet Mol Res ; 15(2), 2016 Jun 24.

Article in English | MEDLINE | ID: mdl-27420972

ABSTRACT

Data integration has become a useful strategy for uncovering new insights into complex biological networks. We studied whether this approach can help to delineate the signal transducer and activator of transcription 6 (STAT6)-mediated transcriptional network driving T helper (Th) 2 cell fate decisions. To this end, we performed an integrative analysis of publicly available RNA-seq data of Stat6-knockout mouse studies together with STAT6 ChIP-seq data and our own gene expression time series data during Th2 cell differentiation. We focused on transcription factors (TFs), cytokines, and cytokine receptors and delineated 59 positively and 41 negatively STAT6-regulated genes, which were used to construct a transcriptional network around STAT6. The network illustrates that important and well-known TFs for Th2 cell differentiation are positively regulated by STAT6 and act either as activators for Th2 cells (e.g., Gata3, Atf3, Satb1, Nfil3, Maf, and Pparg) or as suppressors for other Th cell subpopulations such as Th1 (e.g., Ar), Th17 (e.g., Etv6), or iTreg (e.g., Stat3 and Hif1a) cells. Moreover, our approach reveals 11 TFs (e.g., Atf5, Creb3l2, and Asb2) with unknown functions in Th cell differentiation. This fact together with the observed enrichment of asthma risk genes among those regulated by STAT6 underlines the potential value of the data integration strategy used here. Thus, our results clearly support the opinion that data integration is a useful tool to delineate complex physiological processes.

Subjects

Cell Differentiation/genetics, Gene Regulatory Networks, STAT6 Transcription Factor/genetics, Th2 Cells/metabolism, Animals, Cytokines/metabolism, Mice, Cytokine Receptors/metabolism, STAT6 Transcription Factor/metabolism, Systems Integration, Th2 Cells/cytology

2.

Predicting impaired glucose metabolism in women with polycystic ovary syndrome by decision tree modelling.

Möhlig, M; Flöter, A; Spranger, J; Weickert, M O; Schill, T; Schlösser, H W; Brabant, G; Pfeiffer, A F H; Selbig, J; Schöfl, C.

Diabetologia ; 49(11): 2572-9, 2006 Nov.

Article in English | MEDLINE | ID: mdl-16972044

ABSTRACT

AIMS/HYPOTHESIS: Polycystic ovary syndrome (PCOS) is a risk factor for type 2 diabetes. Screening for impaired glucose metabolism (IGM) with an OGTT has been recommended, but this is relatively time-consuming and inconvenient. Thus, a strategy that could minimise the need for an OGTT would be beneficial. MATERIALS AND METHODS: Consecutive PCOS patients (n=118) with fasting glucose <6.1 mmol/l were included in the study. Parameters derived from medical history, clinical examination and fasting blood samples were assessed by decision tree modelling for their ability to discriminate women with IGM (2-h OGTT value ≥7.8 mmol/l) from those with NGT. RESULTS: According to the OGTT results, 93 PCOS women had NGT and 25 had IGM. The best decision tree consisted of HOMA-IR, the proinsulin:insulin ratio, proinsulin, 17-OH progesterone and the ratio of luteinising hormone:follicle-stimulating hormone. This tree identified 69 women with NGT. The remaining 49 women included all women with IGM (100% sensitivity, 74% specificity to detect IGM). Pruning this tree to three levels still identified 53 women with NGT (100% sensitivity, 57% specificity to detect IGM). Restricting the data matrix used for tree modelling to medical history and clinical parameters produced a tree using BMI, waist circumference and WHR. Pruning this tree to two levels separated 27 women with NGT (100% sensitivity, 29% specificity to detect IGM). The validity of both trees was tested by a leave-10%-out cross-validation. CONCLUSIONS/INTERPRETATION: Decision trees are useful tools for separating PCOS women with NGT from those with IGM. They can be used for stratifying the metabolic screening of PCOS women, whereby the number of OGTTs can be markedly reduced.

Subjects

Glucose Intolerance/etiology, Polycystic Ovary Syndrome/blood, Adult, Blood Glucose/metabolism, Body Mass Index, Cohort Studies, Decision Trees, Female, Glucose Intolerance/blood, Glucose Tolerance Test, Hormones/blood, Humans, Statistical Models, Predictive Value of Tests
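The sensitivity and specificity figures quoted in the abstract of record 2 can be recomputed from the reported counts (93 women with NGT, 25 with IGM). The sketch below is only an arithmetic check, not the authors' code; the function name `screening_metrics` and its parameters are hypothetical, and it assumes that every woman a tree labelled NGT truly had NGT, which is what the reported 100% sensitivity implies.

```python
def screening_metrics(n_ngt, n_igm, ngt_ruled_out, igm_missed=0):
    """Sensitivity and specificity for detecting IGM.

    n_ngt / n_igm   : women with normal / impaired glucose metabolism (by OGTT)
    ngt_ruled_out   : NGT women the tree correctly identifies (true negatives)
    igm_missed      : IGM women wrongly labelled NGT (false negatives)
    """
    sensitivity = (n_igm - igm_missed) / n_igm
    specificity = ngt_ruled_out / n_ngt
    return sensitivity, specificity


# Counts taken from the abstract; the labels are descriptive only.
for label, ruled_out in [
    ("full tree", 69),
    ("pruned to three levels", 53),
    ("clinical-only tree, two levels", 27),
]:
    sens, spec = screening_metrics(n_ngt=93, n_igm=25, ngt_ruled_out=ruled_out)
    print(f"{label}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Under these assumptions the three trees give specificities of 69/93, 53/93 and 27/93, which round to the 74%, 57% and 29% stated in the abstract.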

3.

Threshold extraction in metabolite concentration data.

Flöter, A; Nicolas, J; Schaub, T; Selbig, J.

Bioinformatics ; 20(10): 1491-4, 2004 Jul 10.

Article in English | MEDLINE | ID: mdl-15231540

ABSTRACT

MOTIVATION: Continued development of analytical techniques based on gas chromatography and mass spectrometry now facilitates the generation of larger sets of metabolite concentration data. An important step towards the understanding of metabolite dynamics is the recognition of stable states where metabolite concentrations exhibit a simple behaviour. Such states can be characterized through the identification of significant thresholds in the concentrations. But general techniques for finding discretization thresholds in continuous data prove to be practically insufficient for detecting states due to the weak conditional dependences in concentration data. RESULTS: We introduce a method of recognizing states in the framework of decision tree induction. It is based upon a global analysis of decision forests where stability and quality are evaluated. It leads to the detection of thresholds that are both comprehensible and robust. Applied to metabolite concentration data, this method has led to the discovery of hidden states in the corresponding variables. Some of these reflect known properties of the biological experiments, and others point to putative new states. AVAILABILITY: An implementation of this approach can be obtained from the authors upon request.

Subjects

Algorithms, Differential Threshold/physiology, Plant Gene Expression Regulation/physiology, Biological Models, Plant Proteins/metabolism, Signal Transduction/physiology, Solanum tuberosum/metabolism, Computer Simulation, Energy Metabolism/physiology, Gene Expression Profiling/methods, Homeostasis/physiology
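The abstract of record 3 describes judging thresholds by their stability across many trees, but gives no algorithmic detail. As a rough illustration of that idea only, the sketch below swaps in a much simpler technique: it finds the single information-gain split on one variable (a decision stump) and repeats this on bootstrap resamples to see how much the cut point moves. All names (`best_threshold`, `bootstrap_thresholds`), the toy data and the class labels are assumptions made for this example, not the authors' method.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Cut point on one variable that maximises information gain."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    base = entropy(labels)
    best_gain, best_cut = 0.0, None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no cut possible between equal values
        left = [lab for _, lab in pairs[:k]]
        right = [lab for _, lab in pairs[k:]]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - split
        if gain > best_gain:
            best_gain, best_cut = gain, (pairs[k - 1][0] + pairs[k][0]) / 2
    return best_cut, best_gain


def bootstrap_thresholds(values, labels, rounds=200, seed=0):
    """Cut points found on bootstrap resamples; a narrow spread suggests a stable threshold."""
    rng = random.Random(seed)
    n = len(values)
    cuts = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        cut, _ = best_threshold([values[i] for i in idx], [labels[i] for i in idx])
        if cut is not None:  # resamples containing one class yield no split
            cuts.append(cut)
    return cuts


# Toy concentrations with two hypothetical states "a" and "b".
conc = [1, 2, 3, 10, 11, 12]
state = ["a", "a", "a", "b", "b", "b"]
print(best_threshold(conc, state))
cuts = bootstrap_thresholds(conc, state)
print(min(cuts), max(cuts))
```

A threshold whose bootstrap cut points stay within a narrow band is a candidate for a robust state boundary; a wide spread suggests the split depends on a few samples.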

4.

Metabolite fingerprinting: detecting biological features by independent component analysis.

Scholz, M; Gatzek, S; Sterling, A; Fiehn, O; Selbig, J.

Bioinformatics ; 20(15): 2447-54, 2004 Oct 12.

Article in English | MEDLINE | ID: mdl-15087312

ABSTRACT

MOTIVATION: Metabolite fingerprinting is a technology for providing information from spectra of total compositions of metabolites. Here, spectra acquisition by microchip-based nanoflow-direct-infusion QTOF mass spectrometry, a simple and high-throughput technique, is tested for its informative power. As a simple test case we are using Arabidopsis thaliana crosses. The question is how metabolite fingerprinting reflects the biological background. In many applications the classical principal component analysis (PCA) is used for detecting relevant information. Here a modern alternative is introduced: independent component analysis (ICA). Due to its independence condition, ICA is more suitable for our questions than PCA. However, ICA has not been developed for a small number of high-dimensional samples; therefore a strategy is needed to overcome this limitation. RESULTS: To apply ICA successfully it is essential first to reduce the high dimension of the dataset by using PCA. The number of principal components significantly determines the quality of ICA; therefore we propose a criterion for estimating the optimal dimension automatically. The kurtosis measure is used to order the extracted components according to our interest. Applied to our A. thaliana data, ICA detects three relevant factors, two biological and one technical, and clearly outperforms PCA.

Subjects

Algorithms, Arabidopsis Proteins/metabolism, Arabidopsis/metabolism, Gene Expression Profiling/methods, Microchip Analytical Procedures/methods, Microfluidic Analytical Techniques/methods, Electrospray Ionization Mass Spectrometry/methods, Arabidopsis Proteins/analysis, Biological Models, Statistical Models, Principle-Based Ethics
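The abstract of record 4 mentions ordering extracted components by a kurtosis measure without saying how. A minimal sketch of such an ordering is given below, using the standard excess kurtosis (fourth standardised moment minus 3). The function names, the toy component scores and the default of listing the most negative kurtosis first are assumptions for illustration; the abstract does not state the direction of the ordering.

```python
def excess_kurtosis(values):
    """Excess kurtosis: fourth central moment / variance**2 - 3 (zero for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant component")
    return m4 / (m2 ** 2) - 3.0


def order_by_kurtosis(components, descending=False):
    """Return component names sorted by excess kurtosis.

    components : dict mapping a component name to its list of sample scores.
    descending : assumed switch; False puts the most negative kurtosis first.
    """
    return sorted(
        components,
        key=lambda name: excess_kurtosis(components[name]),
        reverse=descending,
    )


# Toy scores: a two-cluster (bimodal) component and a peaked, heavy-tailed one.
scores = {
    "bimodal": [-1, -1, 1, 1],
    "peaked": [0, 0, 0, 0, 0, 0, -3, 3],
}
print(order_by_kurtosis(scores))
```

Strongly negative kurtosis is typical of scores that fall into two separated groups, while positive kurtosis indicates a few outlying samples, so the sign of the measure already hints at what kind of structure a component captures.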

5.

The mutual information: detecting and evaluating dependencies between variables.

Steuer, R; Kurths, J; Daub, C O; Weise, J; Selbig, J.

Bioinformatics ; 18 Suppl 2: S231-40, 2002.

Article in English | MEDLINE | ID: mdl-12386007

ABSTRACT

MOTIVATION: Clustering co-expressed genes usually requires the definition of 'distance' or 'similarity' between measured datasets, the most common choices being Pearson correlation or Euclidean distance. With the size of available datasets steadily increasing, it has become feasible to consider other, more general, definitions as well. One alternative, based on information theory, is the mutual information, providing a general measure of dependencies between variables. While the use of mutual information in cluster analysis and visualization of large-scale gene expression data has been suggested previously, the earlier studies did not focus on comparing different algorithms to estimate the mutual information from finite data. RESULTS: Here we describe and review several approaches to estimate the mutual information from finite datasets. Our findings show that the algorithms used so far may be quite substantially improved upon. In particular when dealing with small datasets, finite sample effects and other sources of potentially misleading results have to be taken into account.

Subjects

Algorithms, Computational Biology/methods, Gene Expression Profiling/methods, Genetic Models, Oligonucleotide Array Sequence Analysis/methods, Software, Computer Simulation, Humans, Statistical Models
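To make the quantity discussed in record 5 concrete, the sketch below computes the simplest estimate of mutual information from finite data: a plug-in estimate over equal-width histogram bins, I(X;Y) = Σ p(i,j) log2[p(i,j) / (p(i) p(j))]. It is an assumed baseline for illustration, not one of the estimators reviewed by the authors, and the function name and `bins` parameter are hypothetical. It applies no correction for the finite-sample effects the abstract warns about.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=4):
    """Naive plug-in (histogram) estimate of I(X;Y) in bits."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data
        # Clamp so the maximum value falls into the last bin.
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # (c/n) * log2( (c/n) / ((px/n) * (py/n)) ) simplifies to the form below.
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Perfectly dependent toy data: four equally filled bins give log2(4) = 2 bits.
x = list(range(8))
print(mutual_information(x, x, bins=4))
# Independent toy data on a 2x2 grid give 0 bits.
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1], bins=2))
```

With few samples and many bins this estimator is biased, since many joint cells are empty or hold a single point, which is one reason the choice of estimator matters for small datasets.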

6.

Structural analysis of the DNA-binding domain of alternatively spliced steroid receptors.

Wickert, L; Selbig, J.

J Endocrinol ; 173(3): 429-36, 2002 Jun.

Article in English | MEDLINE | ID: mdl-12065232

ABSTRACT

We have generated 24 DNA-binding domain structure models of alternatively spliced or mutated steroid receptor variants by homology-based modeling. Members of the steroid receptor family possess a DNA-binding domain which is built from two zinc fingers with a preserved sequence homology of about 90%. Data from crystallographic analysis of the glucocorticoid receptor DNA-binding domain are therefore appropriate to serve as a template structure. We inserted or deleted amino acid residues in order to study the structural details of the glucocorticoid, mineralocorticoid, and androgen receptor splice variants. The receptor variants were created by QUANTA- and MODELLER-based modeling. Subsequently, the resulting energy-minimized models were compared with each other and with the wild-type receptor. A prediction for the receptor function based mainly on the preservation or destruction of secondary structures has been carried out. The simulations showed that amino acid insertions of one, four or nine additional residues of existing steroid receptor splice variants were tolerated without destruction of the secondary structure. In contrast, a deletion of four amino acids at the splice site junction leads to modifications in the secondary structure of the DNA-recognition helix which apparently disturb the receptor-DNA interaction. Furthermore, an insertion of 23 amino acid residues between the zinc fingers of the androgen receptor leads to a large loop with an additional alpha-helical structure which seems to disconnect a specific contact from its hormone response element. Thereafter, the prediction of receptor function based on the molecular models was compared with the available experimental results from the in vitro function tests. We obtained a close correspondence between the molecular modeling-based predictions and the conclusions of receptor function drawn from in vitro studies.

Subjects

Alternative Splicing, Computer Simulation, Molecular Models, Steroid Receptors/genetics, Protein Secondary Structure, Sequence Alignment

7.

Differential mRNA expression of the two mineralocorticoid receptor splice variants within the human brain: structure analysis of their different DNA binding domains.

Wickert, L; Selbig, J; Watzka, M; Stoffel-Wagner, B; Schramm, J; Bidlingmaier, F; Ludwig, M.

J Neuroendocrinol ; 12(9): 867-73, 2000 Sep.

Article in English | MEDLINE | ID: mdl-10971811

ABSTRACT

In human brain tissue, cortisol action, at basal concentrations, is mediated by the mineralocorticoid receptor (MR). An in-frame insertion of 12 bp in the MR-DNA-binding domain due to alternative splice site usage between exons 3 and 4 results in an MR mRNA splice variant (MR+4) encoding a receptor protein with four additional amino acids compared to the wild-type MR protein. To address sex-, age-, and/or tissue-dependent differences in the relative amounts of the two mRNA subtypes, we examined 131 fresh human brain tissue samples from temporal and frontal lobe or hippocampus. One hundred and twenty samples were obtained from patients with epilepsy and 11 samples from patients with brain tumours. A small but significant difference of the MR+4 mRNA splice variant proportions in cortex (9.5 ± 0.8%) and subcortical white matter (6.6 ± 0.7%) of the temporal lobe could be detected, indicating differential MR splice variant expression within these brain areas. Moreover, the splice variant ratios in samples of the temporal lobe cortex collected from patients with epilepsy differed from samples of patients with brain tumours. These data point to an altered expression of the MR splice variants in epilepsy, and strengthen the supposition of a tissue-specific alternative splicing of the MR mRNA. The frequent occurrence of the MR+4 transcript raises the question of its functional significance. For this reason, an MR+4 DNA-binding-domain structure model was generated by computer-based homology modelling based on the known glucocorticoid receptor structure. The data obtained revealed no distorting effect of the inserted four amino acids on the adjacent secondary structures, thereby suggesting that both zinc fingers retain their function. The resulting structure of the MR+4 model leads to the supposition that the receptor retains its function. Moreover, databank analysis with respect to this kind of steroid receptor variation and our own sequence data of the closely related progesterone receptor sustained the hypothesis that only corticosteroid receptors were affected by this alternative splicing event.

Subjects

Alternative Splicing, Brain Chemistry, DNA/metabolism, Gene Expression, Messenger RNA/analysis, Mineralocorticoid Receptors/genetics, Adult, Amino Acid Sequence, Binding Sites, Child, Preschool Child, Female, Frontal Lobe/chemistry, Hippocampus/chemistry, Humans, Male, Middle Aged, Molecular Models, Molecular Sequence Data, Mineralocorticoid Receptors/chemistry, Mineralocorticoid Receptors/metabolism, Sequence Alignment, Temporal Lobe/chemistry

8.

Decision tree-based formation of consensus protein secondary structure prediction.

Selbig, J; Mevissen, T; Lengauer, T.

Bioinformatics ; 15(12): 1039-46, 1999 Dec.

Article in English | MEDLINE | ID: mdl-10745994

ABSTRACT

MOTIVATION: Prediction of protein secondary structure provides information that is useful for other prediction methods like fold recognition and ab initio 3D prediction. A consensus prediction constructed from the output of several methods should yield more reliable results than each of the individual methods. METHOD: We present an approach that reveals subtle but systematic differences in the output of different secondary structure prediction methods allowing the derivation of coherent consensus predictions. The method uses a machine learning technique that builds decision trees from existing data. RESULTS: The first results of our analysis show that consensus prediction of protein secondary structure may be improved both quantitatively and qualitatively.

Subjects

Decision Trees, Protein Secondary Structure/genetics, Algorithms, Artificial Intelligence, Reproducibility of Results, Systems Integration

9.

Relationships between protein sequence and structure patterns based on residue contacts.

Selbig, J; Argos, P.

Proteins ; 31(2): 172-85, 1998 May 01.

Article in English | MEDLINE | ID: mdl-9593191

ABSTRACT

The identification of correlations between sequence patterns and structural motifs is a prerequisite in the development of protein structure prediction methods. The prediction accuracy indicates whether these correlations are discerned. We present an approach to identify long-range relationships between sequence patterns and structural motifs by varying the granulation of the structure description. Since interaction among residues is a major determinant in protein folding, we consider contact environments formed by two triplets of three sequentially neighboring residues and described by vectors whose components express contact strengths on an atomic level. Through testing various classification schemes, including their resolution and optimizing parameters, discernible relationships between sequences and folds are explored. About ten structural contact states, together with information from noncontacting regions, could improve the accuracy of contact prediction.

Subjects

Molecular Models, Protein Conformation, Chemical Phenomena, Physical Chemistry, Decision Trees, Mathematics, Protein Secondary Structure

10.

Contact pattern-induced pair potentials for protein fold recognition.

Selbig, J.

Protein Eng ; 8(4): 339-51, 1995 Apr.

Article in English | MEDLINE | ID: mdl-7567919

ABSTRACT

The protein structure prediction problem is considered as a problem of fitting a sequence into a folding motif. We focus on finding an approximative structure representation providing the best preferences or contact energies. A 2-D structure description in the form of specific contact matrices is used. The main features of our approach are (i) only contacts involved in characteristic interaction patterns are considered, (ii) amino acid pair preferences or contact energies related to these interaction patterns are derived from the structural database and (iii) from the evaluation of individual structure elements, hypotheses on the alignment of a new sequence to a given structure may be derived. Results are demonstrated in particular on examples of the blue copper proteins.

Subjects

Protein Folding, Proteins/chemistry, Azurin/analogs & derivatives, Azurin/chemistry, Binding Sites, Humans, Mathematics, Chemical Models, Molecular Structure, Muramidase/chemistry, Protein Secondary Structure, Pancreatic Ribonuclease/chemistry, Thermodynamics

11.

Analysis of protein sheet topologies by graph theoretical methods.

Koch, I; Kaden, F; Selbig, J.

Proteins ; 12(4): 314-23, 1992 Apr.

Article in English | MEDLINE | ID: mdl-1579565

ABSTRACT

In order to find rules for the secondary structure prediction of proteins which describe the (sequentially) long-range interactions in sheet structures, methods of applied graph theory were used. The so-called beta graph, which describes the sheet topology, was defined for every protein in the Brookhaven Data Bank containing beta sheets. The resemblance of proteins at that topological level is discussed, and four notations and graphic representations of sheets which describe the sequential and topological neighborhoods of the strands were derived. This description level supports the usage of data structures which allow the implementation of efficient algorithms for the analysis and comparison of beta structures in proteins. A computer program for the representation and retrieval of bibliographic data and beta sheet structures was implemented. Some examples of substructure search illustrate the usefulness of the program. Two graphic catalogues were compiled: one contains all beta graphs of PDB proteins and the other all occurring different Greek key descriptions.

Subjects

Computer Simulation, Protein Conformation, Bibliographic Databases, Information Storage and Retrieval, Molecular Models

12.

Applying machine learning methods for finding significant amino acid properties in proteins.

Selbig, J; Kaden, F; Koch, I.

FEBS Lett ; 297(3): 241-6, 1992 Feb 10.

Article in English | MEDLINE | ID: mdl-1544403

ABSTRACT

There are several possibilities for definition and derivation of sequence patterns associated with structural motifs, in particular on the secondary structure level, which may be used to predict these structure elements. Sequence patterns consist of a number of consecutive positions along the polypeptide chain from which a certain quantity is specified. One of the important factors in deriving sequence patterns in terms of amino acid properties is how to find the most characteristic properties to specify a certain position and thus to avoid redundant physical information. We have applied machine learning methods to select the most significant amino acid properties describing a structurally determined sequence position. Results are given for the beginning of alpha-helices. These methods may bridge the gap between amino acid patterns and property patterns and thus are valuable to improve protein structure prediction.

Subjects

Amino Acids/chemistry, Computer Simulation, Proteins/chemistry

13.

Knowledge-based prediction of protein structures.

Kaden, F; Koch, I; Selbig, J.

J Theor Biol ; 147(1): 85-100, 1990 Nov 07.

Article in English | MEDLINE | ID: mdl-2277506

ABSTRACT

We propose a knowledge-based approach to the prediction of protein structures in cases where there is no sequence homology to proteins with known spatial structure. Using methods from Artificial Intelligence, we attempt to take into account long-range interactions within the prediction process. This allows not only the assignment of secondary but also of supersecondary structure elements. In particular, the patterns used as conditions of prediction rules are generated by learning methods from information contained in the Protein Data Base. Patterns on higher levels of the protein structure hierarchy are used as constraints to reduce the combinatorial search space. These patterns may also be used to search for specified structure motifs by interactive retrieval.

Subjects

Artificial Intelligence, Protein Conformation, Proteins/chemistry
