Results 1 - 19 of 19
1.
Biotechnol Biofuels ; 9: 252, 2016.
Article in English | MEDLINE | ID: mdl-27895706

ABSTRACT

BACKGROUND: Trichoderma reesei is one of the main sources of biomass-hydrolyzing enzymes for the biotechnology industry. There is a need for improving its enzyme production efficiency. The use of metabolic modeling for the simulation and prediction of this organism's metabolism is potentially a valuable tool for improving its capabilities. An accurate metabolic model is needed to perform metabolic modeling analysis. RESULTS: A whole-genome metabolic model of T. reesei has been reconstructed together with metabolic models of 55 related species using the metabolic model reconstruction algorithm CoReCo. The previously published CoReCo method has been improved to obtain better quality models. The main improvements are the creation of a unified database of reactions and compounds and the use of reaction directions as constraints in the gap-filling step of the algorithm. In addition, the biomass composition of T. reesei has been measured experimentally to build and include a specific biomass equation in the model. CONCLUSIONS: The improvements presented in this work on the CoReCo pipeline for metabolic model reconstruction resulted in higher-quality metabolic models compared with previous versions. A metabolic model of T. reesei has been created and is publicly available in the BIOMODELS database. The model contains a biomass equation, reaction boundaries and uptake/export reactions which make it ready for simulation. To validate the model, we demonstrate that the model is able to predict biomass production accurately and no stoichiometrically infeasible yields are detected. The new T. reesei model is ready to be used for simulations of protein production processes.

2.
PLoS One ; 11(7): e0159302, 2016.
Article in English | MEDLINE | ID: mdl-27441920

ABSTRACT

In this paper we apply machine learning methods for predicting protein interactions in fungal secretion pathways. We assume an inter-species transfer setting, where training data is obtained from a single species and the objective is to predict protein interactions in other, related species. In our methodology, we combine several state-of-the-art machine learning approaches, namely, multiple kernel learning (MKL), pairwise kernels and kernelized structured output prediction in the supervised graph inference framework. For MKL, we apply the recently proposed centered kernel alignment and p-norm path following approaches to integrate several feature sets describing the proteins, demonstrating improved performance. For graph inference, we apply input-output kernel regression (IOKR) in supervised and semi-supervised modes as well as output kernel trees (OK3). In our experiments simulating increasing genetic distance, Input-Output Kernel Regression proved to be the most robust prediction approach. We also show that the MKL approaches improve the predictions compared to uniform combination of the kernels. We evaluate the methods on the task of predicting protein-protein interactions in the secretion pathways of fungi, with S. cerevisiae (baker's yeast) as the source and T. reesei as the target of the inter-species transfer learning. We identify completely novel candidate secretion proteins conserved in filamentous fungi. These proteins could contribute to their unique secretion capabilities.


Subjects
Fungal Proteins/metabolism , Machine Learning , Protein Interaction Mapping , Saccharomyces cerevisiae/metabolism , Secretory Pathway , Trichoderma/metabolism , Algorithms , Amino Acid Sequence , Databases, Protein , Evolution, Molecular , Fungal Proteins/chemistry , Genome, Fungal , Protein Interaction Maps , ROC Curve , Saccharomyces cerevisiae/genetics
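
The abstract of entry 2 names centered kernel alignment as one of the tools used to combine kernels in MKL. As a rough illustration only, the sketch below computes the standard centered alignment score between two kernel matrices in plain Python. The toy feature vectors, the labels, and the helper names (`linear_kernel`, `target_kernel`) are hypothetical and are not taken from the paper; how the authors turn alignment scores into kernel weights is not stated in the abstract and is not reproduced here.

```python
import math


def center_kernel(K):
    """Center a square kernel matrix: Kc = H K H with H = I - (1/n) * ones."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def frobenius_inner(A, B):
    """Frobenius inner product: sum of element-wise products."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def centered_alignment(K1, K2):
    """Centered alignment <K1c, K2c>_F / (||K1c||_F * ||K2c||_F)."""
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    denom = math.sqrt(frobenius_inner(K1c, K1c) * frobenius_inner(K2c, K2c))
    if denom == 0.0:
        raise ValueError("alignment is undefined for a constant kernel")
    return frobenius_inner(K1c, K2c) / denom


def linear_kernel(X):
    """Hypothetical helper: Gram matrix of dot products between rows of X."""
    return [[sum(a * b for a, b in zip(x, y)) for y in X] for x in X]


def target_kernel(labels):
    """Hypothetical helper: ideal kernel y y^T built from +1/-1 labels."""
    return [[a * b for b in labels] for a in labels]


if __name__ == "__main__":
    # Toy data (assumption, for illustration only): four items, two features.
    features = [[1.0, 0.0], [0.9, 0.2], [0.1, 1.0], [0.0, 0.8]]
    labels = [1, 1, -1, -1]
    score = centered_alignment(linear_kernel(features), target_kernel(labels))
    print(f"alignment with target kernel: {score:.3f}")
```

A kernel that agrees well with the label-derived target kernel scores close to 1, so scores of this kind give a simple way to rank candidate feature sets before combining them.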
3.
Appl Microbiol Biotechnol ; 100(16): 7203-22, 2016 Aug.
Article in English | MEDLINE | ID: mdl-27183995

ABSTRACT

The genomes of hybrid organisms, such as lager yeast (Saccharomyces cerevisiae × Saccharomyces eubayanus), contain orthologous genes, the functionality and effect of which may differ depending on their origin and copy number. How the parental subgenomes in lager yeast contribute to important phenotypic traits such as fermentation performance, aroma production, and stress tolerance remains poorly understood. Here, three de novo lager yeast hybrids with different ploidy levels (allodiploid, allotriploid, and allotetraploid) were generated through hybridization techniques without genetic modification. The hybrids were characterized in fermentations of both high gravity wort (15 °P) and very high gravity wort (25 °P), which were monitored for aroma compound and sugar concentrations. The hybrid strains with higher DNA content performed better during fermentation and produced higher concentrations of flavor-active esters in both worts. The hybrid strains also outperformed both parent strains. Genome sequencing revealed that several genes related to the formation of flavor-active esters (ATF1, ATF2, EHT1, EEB1, and BAT1) were present in higher copy numbers in the higher ploidy hybrid strains. A direct relationship between gene copy number and transcript level was also observed. The measured ester concentrations and transcript levels also suggest that the functionality of the S. cerevisiae- and S. eubayanus-derived gene products differs. The results contribute to our understanding of the complex molecular mechanisms that determine phenotypes in lager yeast hybrids and are expected to facilitate targeted strain development through interspecific hybridization.


Subjects
Beer/microbiology , Chimera/genetics , Ethanol/metabolism , Fermentation/genetics , Saccharomyces cerevisiae/genetics , Chimera/growth & development , DNA, Fungal/genetics , Esters/analysis , Hybridization, Genetic , Organic Chemicals/analysis , Ploidies , Polymerase Chain Reaction , Polymorphism, Restriction Fragment Length , Saccharomyces cerevisiae/classification , Saccharomyces cerevisiae/metabolism , Transcription, Genetic/genetics
4.
Appl Microbiol Biotechnol ; 100(17): 7549-63, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27102126

ABSTRACT

We describe here the identification and characterization of two novel enzymes belonging to the IlvD/EDD protein family, the D-xylonate dehydratase from Caulobacter crescentus, Cc XyDHT (EC 4.2.1.82), and the L-arabonate dehydratase from Rhizobium leguminosarum bv. trifolii, Rl ArDHT (EC 4.2.1.25), that produce the corresponding 2-keto-3-deoxy-sugar acids. There is only a very limited amount of characterization data available on pentonate dehydratases, even though the enzymes from these oxidative pathways have potential applications with plant biomass pentose sugars. The two bacterial enzymes share 41 % amino acid sequence identity and were expressed and purified from Escherichia coli as homotetrameric proteins. Both dehydratases were shown to accept pentonate and hexonate sugar acids as their substrates and require Mg²⁺ for their activity. Cc XyDHT displayed the highest activity on D-xylonate and D-gluconate, while Rl ArDHT functioned best on D-fuconate, L-arabonate and D-galactonate. The configurations of the OH groups at the C2 and C3 positions of the sugar acid were shown to be critical, and the C4 configuration also contributed substantially to the substrate recognition. The two enzymes were also shown to contain an iron-sulphur [Fe-S] cluster. Our phylogenetic analysis and mutagenesis studies demonstrated that the three conserved cysteine residues in the aldonic acid dehydratase group of IlvD/EDD family members, namely C60, C128 and C201 in Cc XyDHT, and C59, C127 and C200 in Rl ArDHT, are needed for coordination of the [Fe-S] cluster. The iron-sulphur cluster was shown to be crucial for the catalytic activity (kcat) but not for the substrate binding (Km) of the two pentonate dehydratases.


Subjects
Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Caulobacter crescentus/enzymology , Hydro-Lyases/genetics , Hydro-Lyases/metabolism , Rhizobium leguminosarum/enzymology , Amino Acid Sequence , Arabinose/metabolism , Cloning, Molecular , Escherichia coli/genetics , Escherichia coli/metabolism , Gluconates/metabolism , Sequence Alignment , Xylose/metabolism
5.
Appl Microbiol Biotechnol ; 100(2): 969-85, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26454869

ABSTRACT

Xylose is present with glucose in lignocellulosic streams available for valorisation to biochemicals. Saccharomyces cerevisiae has excellent characteristics as a host for the bioconversion, except that it strongly prefers glucose to xylose, and the co-consumption remains a challenge. Further, since xylose is not a natural substrate of S. cerevisiae, the regulatory response it induces in an engineered strain cannot be expected to have evolved for its utilisation. Xylose-induced effects on metabolism and gene expression during anaerobic growth of an engineered strain of S. cerevisiae on medium containing both glucose and xylose were quantified. The gene expression of S. cerevisiae with an XR-XDH pathway for xylose utilisation was analysed throughout the cultivation: at early cultivation times when mainly glucose was metabolised, at times when xylose was co-consumed in the presence of low glucose concentrations, and when glucose had been depleted and only xylose was being consumed. Cultivations on glucose as a sole carbon source were used as a control. Genome-scale dynamic flux balance analysis models were simulated to analyse the metabolic dynamics of S. cerevisiae. The simulations quantitatively estimated xylose-dependent flux dynamics and challenged the utilisation of the metabolic network. A relative increase in xylose utilisation was predicted to induce the bi-directionality of glycolytic flux and a redox challenge even at low glucose concentrations. Remarkably, xylose was observed to specifically delay the glucose-dependent repression of particular genes in mixed glucose-xylose cultures compared to glucose cultures. The delay occurred at a cultivation time when the metabolic flux activities were similar in both cultures.


Subjects
Disaccharides/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Xylose/metabolism , Anaerobiosis , Biomass , Culture Media/chemistry , Fermentation , Gene Expression , Genetic Engineering , Glucose/metabolism , Lignin/chemistry , Metabolic Networks and Pathways/genetics , Microarray Analysis , Saccharomyces cerevisiae/growth & development
6.
Appl Microbiol Biotechnol ; 99(22): 9439-47, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26264136

ABSTRACT

An open reading frame CC1225 from the Caulobacter crescentus CB15 genome sequence belongs to the Gfo/Idh/MocA protein family and has 47 % amino acid sequence identity with the glucose-fructose oxidoreductase from Zymomonas mobilis (Zm GFOR). We expressed the ORF CC1225 in the yeast Saccharomyces cerevisiae and used a yeast strain expressing the gene coding for Zm GFOR as a reference. Cell extracts of strains overexpressing CC1225 (renamed as Cc aaor) showed some Zm GFOR type of activity, producing D-gluconate and D-sorbitol when a mixture of D-glucose and D-fructose was used as substrate. However, the activity in the Cc aaor expressing strain was >100-fold lower compared to strains expressing Zm gfor. Interestingly, C. crescentus AAOR was clearly more efficient than the Zm GFOR in converting in vitro a single sugar substrate D-xylose (10 mM) to xylitol without an added cofactor, whereas this type of activity was very low with Zm GFOR. Furthermore, when cultured in the presence of D-xylose, the S. cerevisiae strain expressing Cc aaor produced nearly equal concentrations of D-xylonate and xylitol (12.5 g D-xylonate l⁻¹ and 11.5 g D-xylitol l⁻¹ from 26 g D-xylose l⁻¹), whereas the control strain and strain expressing Zm gfor produced only D-xylitol (5 g l⁻¹). Deletion of the gene encoding the major aldose reductase, Gre3p, did not affect xylitol production in the strain expressing Cc aaor, but decreased xylitol production in the strain expressing Zm gfor. In addition, expression of Cc aaor together with the D-xylonolactone lactonase-encoding gene xylC from C. crescentus slightly increased the final concentration and initial volumetric production rate of both D-xylonate and D-xylitol. These results suggest that C. crescentus AAOR is a novel type of oxidoreductase able to convert the single aldose substrate D-xylose to both its oxidized and reduced products.


Subjects
Aldehyde Reductase/isolation & purification , Aldehyde Reductase/metabolism , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae/genetics , Sugar Acids/metabolism , Xylitol/metabolism , Xylose/metabolism , Aldehyde Reductase/genetics , Caulobacter crescentus/enzymology , Caulobacter crescentus/genetics , Gluconates/metabolism , Glucose/metabolism , Oxidation-Reduction , Oxidoreductases/genetics , Oxidoreductases/metabolism , Phylogeny , Saccharomyces cerevisiae/metabolism , Sorbitol/metabolism , Zymomonas/enzymology , Zymomonas/genetics
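
The abstract of entry 6 reports product titres in grams per litre. Converting them to molar units makes it easier to see whether the oxidized and reduced products are formed in comparable amounts. The sketch below is only a back-of-the-envelope check: the molar masses are standard textbook values (with D-xylonate approximated by free D-xylonic acid), not figures from the paper, and the function name `molar_balance` is hypothetical. The 26 g l⁻¹ figure is treated as the D-xylose provided, since the abstract does not say how much was consumed.

```python
# Molar masses in g/mol. These are standard values and an assumption of this
# sketch; the abstract itself gives only mass concentrations.
MOLAR_MASS = {
    "D-xylose": 150.13,    # C5H10O5
    "D-xylonate": 166.13,  # approximated as free D-xylonic acid, C5H10O6
    "xylitol": 152.15,     # C5H12O5
}


def to_mmol_per_l(compound, grams_per_l):
    """Convert a mass concentration (g/l) to a molar concentration (mmol/l)."""
    return 1000.0 * grams_per_l / MOLAR_MASS[compound]


def molar_balance(xylose_g_l, xylonate_g_l, xylitol_g_l):
    """Return molar concentrations and two simple ratios for the products."""
    xylose = to_mmol_per_l("D-xylose", xylose_g_l)
    xylonate = to_mmol_per_l("D-xylonate", xylonate_g_l)
    xylitol = to_mmol_per_l("xylitol", xylitol_g_l)
    return {
        "xylose_mmol_per_l": xylose,
        "xylonate_mmol_per_l": xylonate,
        "xylitol_mmol_per_l": xylitol,
        # Ratio of oxidized to reduced product on a molar basis.
        "xylonate_to_xylitol": xylonate / xylitol,
        # Share of the provided D-xylose found in the two products.
        "fraction_recovered": (xylonate + xylitol) / xylose,
    }


if __name__ == "__main__":
    # Titres quoted in the abstract: 26 g/l D-xylose, 12.5 g/l D-xylonate,
    # 11.5 g/l xylitol.
    for name, value in molar_balance(26.0, 12.5, 11.5).items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the two products come out at roughly 75 mmol l⁻¹ each, and together they account for about 87 % of the D-xylose provided, which is consistent with the abstract's description of nearly equal concentrations.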
7.
Metab Eng ; 31: 153-62, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26275749

ABSTRACT

Isoprene is a naturally produced hydrocarbon emitted into the atmosphere by green plants. It is also a constituent of synthetic rubber and a potential biofuel. Microbial production of isoprene can become a sustainable alternative to the prevailing chemical production of isoprene from petroleum. In this work, sequence homology searches were conducted to find novel isoprene synthases. Candidate sequences were functionally expressed in Escherichia coli and the desired enzymes were identified based on an isoprene production assay. The activity of three enzymes was shown for the first time: expression of the candidate genes from Ipomoea batatas, Mangifera indica, and Elaeocarpus photiniifolius resulted in isoprene formation. The Ipomoea batatas isoprene synthase produced the highest amounts of isoprene in all experiments, exceeding the isoprene levels obtained by the previously known Populus alba and Pueraria montana isoprene synthases that were studied in parallel as controls.


Subjects
Alkyl and Aryl Transferases/isolation & purification , Escherichia coli/genetics , Alkyl and Aryl Transferases/chemistry , Alkyl and Aryl Transferases/physiology , Amino Acid Sequence , Butadienes , Genome, Bacterial , Hemiterpenes/biosynthesis , Molecular Sequence Data , Pentanes , Sequence Homology
8.
BMC Biotechnol ; 14: 91, 2014 Oct 27.
Article in English | MEDLINE | ID: mdl-25344685

ABSTRACT

BACKGROUND: Trichoderma reesei is known as a good producer of industrial proteins but has hitherto been less successful in the production of therapeutic proteins. In order to elucidate the bottlenecks of heterologous protein production, human α-galactosidase A (GLA) was chosen as a model therapeutic protein. Fusion partners were designed to compare the effects of secretion using a cellobiohydrolase I (CBHI) carrier and intracellular production using a gamma zein peptide from maize (ZERA) which accumulates inside the endoplasmic reticulum (ER). The two strategies were compared on the basis of expression levels, purification performance, enzymatic activity, bioreactor cultivations, and transcriptional profiling. RESULTS: Constructs were cloned into the cbh1 locus of the T. reesei strain Rut-C30. The secretion and intracellular strains produced 20 mg/l and 636 mg/l of GLA respectively. Purifications of secreted product were accomplished using Strep-Tactin affinity columns and for intracellular product, a method was developed for gravity-based density separation and protein body solubilisation. The secreted protein had similar specific activity to that of the commercially available mammalian form. The intracellular version had 5-10-fold lower activity due to the enzyme's incompatibility with alkaline pH. The secretion strain achieved 10% lower total biomass than either the parental or the intracellular strain. The patterns of gene induction for intracellular and parental strains were similar, whereas the secretion strain had a broader spectrum of gene expression level changes. Identification of the genes involved indicated strong secretion stress in the secretion strain and to a lesser extent also in intracellular production. Genes involved in the unfolded protein response (UPR) and ER-associated degradation were induced by GLA production, including hac1, pdi1, prp1, cnx1, der1, and bap31. CONCLUSIONS: Active human α-galactosidase could most effectively be produced intracellularly in Trichoderma reesei at >0.5 g/l by avoidance of the extracellular environment, although purification was challenging due to specific activity losses. Strain analysis revealed that in addition to the issues with secreted proteases, the processes of secretion stress including UPR and ER degradation remain as bottlenecks for heterologous protein production. Genetic engineering to eliminate these bottlenecks is the logical path towards establishing a strain capable of producing sensitive heterologous proteins.


Subjects
Protein Engineering/methods , alpha-Galactosidase/genetics , alpha-Galactosidase/metabolism , Humans , Protein Sorting Signals , Protein Transport , Secretory Pathway , Trichoderma/genetics
9.
Appl Microbiol Biotechnol ; 98(23): 9653-65, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25236800

ABSTRACT

Four potential dehydrogenases identified through literature and bioinformatic searches were tested for L-arabonate production from L-arabinose in the yeast Saccharomyces cerevisiae. The most efficient enzyme, annotated as a D-galactose 1-dehydrogenase from the pea root nodule bacterium Rhizobium leguminosarum bv. trifolii, was purified from S. cerevisiae as a homodimeric protein and characterised. We named the enzyme L-arabinose/D-galactose 1-dehydrogenase (EC 1.1.1.-), Rl AraDH. It belongs to the Gfo/Idh/MocA protein family, prefers NADP⁺ but also uses NAD⁺ as a cofactor, and showed highest catalytic efficiency (kcat/Km) towards L-arabinose, D-galactose and D-fucose. Based on nuclear magnetic resonance (NMR) and modelling studies, the enzyme prefers the α-pyranose form of L-arabinose, and the stable oxidation product detected is L-arabino-1,4-lactone which can, however, open slowly at neutral pH to a linear L-arabonate form. The pH optimum for the enzyme was pH 9, but use of a yeast-in-vivo-like buffer at pH 6.8 indicated that good catalytic efficiency could still be expected in vivo. Expression of the Rl AraDH dehydrogenase in S. cerevisiae, together with the galactose permease Gal2 for L-arabinose uptake, resulted in production of 18 g of L-arabonate per litre, at a rate of 248 mg of L-arabonate per litre per hour, with 86 % of the provided L-arabinose converted to L-arabonate. Expression of a lactonase-encoding gene from Caulobacter crescentus was not necessary for L-arabonate production in yeast.


Subjects
Arabinose/metabolism , Galactose Dehydrogenases/metabolism , Rhizobium leguminosarum/enzymology , Saccharomyces cerevisiae/metabolism , Sugar Acids/metabolism , Cloning, Molecular , Coenzymes/metabolism , Enzyme Stability , Galactose Dehydrogenases/chemistry , Galactose Dehydrogenases/genetics , Galactose Dehydrogenases/isolation & purification , Gene Expression , Hydrogen-Ion Concentration , Kinetics , Molecular Sequence Data , NAD/metabolism , NADP/metabolism , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/isolation & purification , Recombinant Proteins/metabolism , Rhizobium leguminosarum/metabolism , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae/genetics , Sequence Analysis, DNA
10.
BMC Genomics ; 15: 763, 2014 Sep 05.
Article in English | MEDLINE | ID: mdl-25192596

ABSTRACT

BACKGROUND: Production of D-xylonate by the yeast S. cerevisiae provides an example of bioprocess development for sustainable production of value-added chemicals from cheap raw materials or side streams. Production of D-xylonate may lead to considerable intracellular accumulation of D-xylonate and to loss of viability during the production process. In order to understand the physiological responses associated with D-xylonate production, we performed transcriptome analyses during D-xylonate production by a robust recombinant strain of S. cerevisiae which produces up to 50 g/L D-xylonate. RESULTS: Comparison of the transcriptomes of the D-xylonate producing and the control strain showed considerably higher expression of the genes controlled by the cell wall integrity (CWI) pathway and of some genes previously identified as up-regulated in response to other organic acids in the D-xylonate producing strain. Increased phosphorylation of Slt2 kinase in the D-xylonate producing strain also indicated that D-xylonate production caused stress to the cell wall. Surprisingly, genes encoding proteins involved in translation, ribosome structure and RNA metabolism, processes which are commonly down-regulated under conditions causing cellular stress, were up-regulated during D-xylonate production, compared to the control. The overall transcriptional responses were, therefore, very dissimilar to those previously reported as being associated with stress, including stress induced by organic acid treatment or production. Quantitative PCR analyses of selected genes supported the observations made in the transcriptomic analysis. In addition, consumption of ethanol was slower and the level of trehalose was lower in the D-xylonate producing strain, compared to the control. CONCLUSIONS: The production of organic acids has a major impact on the physiology of yeast cells, but the transcriptional responses to presence or production of different acids differ considerably, being much more diverse than responses to other stresses. D-Xylonate production apparently imposed considerable stress on the cell wall. Transcriptional data also indicated that activation of the PKA pathway occurred during D-xylonate production, leaving cells unable to adapt normally to stationary phase. This, together with intracellular acidification, probably contributes to cell death.


Subjects
Cell Wall/metabolism , Gene Expression Profiling/methods , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/physiology , Sugar Acids/metabolism , Gene Expression Regulation, Fungal , MAP Kinase Signaling System , Mitogen-Activated Protein Kinases/genetics , Mitogen-Activated Protein Kinases/metabolism , Molecular Sequence Data , Phosphorylation , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/metabolism , Sequence Analysis, RNA , Stress, Physiological , Xylose/metabolism
11.
PLoS Comput Biol ; 10(2): e1003465, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24516375

ABSTRACT

We introduce a novel computational approach, CoReCo, for comparative metabolic reconstruction and provide genome-scale metabolic network models for 49 important fungal species. Leveraging the exponential growth in sequenced genome availability, our method reconstructs genome-scale gapless metabolic networks simultaneously for a large number of species by integrating sequence data in a probabilistic framework. High reconstruction accuracy is demonstrated by comparisons to the well-curated Saccharomyces cerevisiae consensus model and large-scale knock-out experiments. Our comparative approach is particularly useful in scenarios where the quality of available sequence data is lacking, and when reconstructing evolutionarily distant species. Moreover, the reconstructed networks are fully carbon mapped, allowing their use in ¹³C flux analysis. We demonstrate the functionality and usability of the reconstructed fungal models with a computational steady-state biomass production experiment, as these fungi include some of the most important production organisms in industrial biotechnology. In contrast to many existing reconstruction techniques, only minimal manual effort is required before the reconstructed models are usable in flux balance experiments. CoReCo is available at http://esaskar.github.io/CoReCo/.


Subjects
Fungi/genetics , Fungi/metabolism , Genome, Fungal , Metabolic Networks and Pathways , Algorithms , Biomass , Biotechnology , Computational Biology , Evolution, Molecular , Fungi/classification , Gene Knockout Techniques , Industrial Microbiology , Metabolic Networks and Pathways/genetics , Models, Biological , Models, Genetic , Models, Statistical , Phylogeny , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/growth & development , Saccharomyces cerevisiae/metabolism , Species Specificity
12.
BMC Syst Biol ; 8: 16, 2014 Feb 14.
Article in English | MEDLINE | ID: mdl-24528924

ABSTRACT

BACKGROUND: Saccharomyces cerevisiae is able to adapt to a wide range of external oxygen conditions. Previously, oxygen-dependent phenotypes have been studied individually at the transcriptional, metabolite, and flux level. However, the regulation of cell phenotype occurs across the different levels of cell function. Integrative analysis of data from multiple levels of cell function in the context of a network of several known biochemical interaction types could enable identification of active regulatory paths not limited to a single level of cell function. RESULTS: The graph theoretical method called Enriched Molecular Path detection (EMPath) was extended to enable integrative utilization of transcription and flux data. The utility of the method was demonstrated by detecting paths associated with phenotype differences of S. cerevisiae under three different conditions of oxygen provision: 20.9%, 2.8% and 0.5%. The detection of molecular paths was performed in an integrated genome-scale metabolic and protein-protein interaction network. CONCLUSIONS: The molecular paths associated with the phenotype differences of S. cerevisiae under conditions of different oxygen provisions revealed paths of molecular interactions that could potentially mediate information transfer between processes that respond to the particular oxygen availabilities.


Subjects
Computational Biology/methods , Phenotype , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Transcription, Genetic , Cell Cycle , Down-Regulation , Fermentation , Gene Expression Regulation, Fungal , Oxygen , Saccharomyces cerevisiae/cytology
13.
Microb Cell Fact ; 11: 134, 2012 Oct 04.
Article in English | MEDLINE | ID: mdl-23035824

ABSTRACT

BACKGROUND: Trichoderma reesei is a soft rot Ascomycota fungus utilised for industrial production of secreted enzymes, especially lignocellulose degrading enzymes. About 30 carbohydrate active enzymes (CAZymes) of T. reesei have been biochemically characterised. Genome sequencing has revealed a large number of novel candidates for CAZymes, thus increasing the potential for identification of enzymes with novel activities and properties. Plenty of data exists on the carbon source dependent regulation of the characterised hydrolytic genes. However, information on the expression of the novel CAZyme genes, especially on complex biomass material, is very limited. RESULTS: In this study, the CAZyme gene content of the T. reesei genome was updated and the annotations of the genes refined using both computational and manual approaches. Phylogenetic analysis was done to assist the annotation and to identify functionally diversified CAZymes. The analyses identified 201 glycoside hydrolase genes, 22 carbohydrate esterase genes and five polysaccharide lyase genes. Updated or novel functional predictions were assigned to 44 genes, and the phylogenetic analysis indicated further functional diversification within enzyme families or groups of enzymes. GH3 β-glucosidases, GH27 α-galactosidases and GH18 chitinases were especially functionally diverse. The expression of the lignocellulose degrading enzyme system of T. reesei was studied by cultivating the fungus in the presence of different inducing substrates and by subjecting the cultures to transcriptional profiling. The substrates included both defined and complex lignocellulose related materials, such as pretreated bagasse, wheat straw, spruce, xylan, Avicel cellulose and sophorose. The analysis revealed co-regulated groups of CAZyme genes, such as genes induced in all the conditions studied and also genes induced preferentially by a certain set of substrates. CONCLUSIONS: In this study, the CAZyme content of the T. reesei genome was updated, the discrepancies between the different genome versions and published literature were removed and the annotation of many of the genes was refined. Expression analysis of the genes gave information on the enzyme activities potentially induced by the presence of the different substrates. Comparison of the expression profiles of the CAZyme genes under the different conditions identified co-regulated groups of genes, suggesting common regulatory mechanisms for the gene groups.


Subjects
Lignin/metabolism , Trichoderma/genetics , Biomass , Cellulases/classification , Cellulases/genetics , Databases, Factual , Gene Expression Profiling , Genome, Fungal , Glycoside Hydrolases/genetics , Glycoside Hydrolases/metabolism , Phylogeny , Polysaccharide-Lyases/genetics , Polysaccharide-Lyases/metabolism , Substrate Specificity
14.
PLoS One ; 7(3): e32235, 2012.
Article in English | MEDLINE | ID: mdl-22461885

ABSTRACT

A variety of functionally important protein properties, such as secondary structure, transmembrane topology and solvent accessibility, can be encoded as a labeling of amino acids. Indeed, the prediction of such properties from the primary amino acid sequence is one of the core projects of computational biology. Accordingly, a panoply of approaches have been developed for predicting such properties; however, most such approaches focus on solving a single task at a time. Motivated by recent, successful work in natural language processing, we propose to use multitask learning to train a single, joint model that exploits the dependencies among these various labeling tasks. We describe a deep neural network architecture that, given a protein sequence, outputs a host of predicted local properties, including secondary structure, solvent accessibility, transmembrane topology, signal peptides and DNA-binding residues. The network is trained jointly on all these tasks in a supervised fashion, augmented with a novel form of semi-supervised learning in which the model is trained to distinguish between local patterns from natural and synthetic protein sequences. The task-independent architecture of the network obviates the need for task-specific feature engineering. We demonstrate that, for all of the tasks that we considered, our approach leads to statistically significant improvements in performance, relative to a single-task neural network approach, and that the resulting model achieves state-of-the-art performance.


Subjects
Computational Biology/methods , Neural Networks, Computer , Protein Structure, Secondary , Proteins/chemistry , Algorithms , Binding Sites , Membrane Proteins/chemistry , Reproducibility of Results
15.
BMC Genomics ; 11: 441, 2010 Jul 19.
Article in English | MEDLINE | ID: mdl-20642838

ABSTRACT

BACKGROUND: Trichoderma reesei is the main industrial producer of cellulases and hemicellulases that are used to depolymerize biomass in a variety of biotechnical applications. Many of the production strains currently in use have been generated by classical mutagenesis. In this study we characterized genomic alterations in high-producing mutants of T. reesei by high-resolution array comparative genomic hybridization (aCGH). Our aim was to obtain genome-wide information which could be utilized for better understanding of the mechanisms underlying efficient cellulase production, and would enable targeted genetic engineering for improved production of proteins in general. RESULTS: We carried out an aCGH analysis of four high-producing strains (QM9123, QM9414, NG14 and Rut-C30) using the natural isolate QM6a as a reference. In QM9123 and QM9414 we detected a total of 44 previously undocumented mutation sites including deletions, chromosomal translocation breakpoints and single nucleotide mutations. In NG14 and Rut-C30 we detected 126 mutations of which 17 were new mutations not documented previously. Among these new mutations are the first chromosomal translocation breakpoints identified in NG14 and Rut-C30. We studied the effects of two deletions identified in Rut-C30 (a deletion of 85 kb in the scaffold 15 and a deletion in a gene encoding a transcription factor) on cellulase production by constructing knock-out strains in the QM6a background. Neither the 85 kb deletion nor the deletion of the transcription factor affected cellulase production. CONCLUSIONS: aCGH analysis identified dozens of mutations in each strain analyzed. The resolution was at the level of single nucleotide mutation. High-density aCGH is a powerful tool for genome-wide analysis of organisms with small genomes e.g. fungi, especially in studies where a large set of interesting strains is analyzed.


Subjects
Cellulase/biosynthesis , Comparative Genomic Hybridization/methods , Oligonucleotide Array Sequence Analysis/methods , Trichoderma/genetics , Trichoderma/metabolism , DNA, Fungal/genetics , Genomics , Oligonucleotide Probes/genetics , Polymorphism, Single Nucleotide , Sequence Deletion
16.
PLoS One ; 4(4): e5179, 2009.
Article in English | MEDLINE | ID: mdl-19365549

ABSTRACT

BACKGROUND: Retroviral LTRs, paired or single, influence the transcription of both retroviral and non-retroviral genomic sequences. Vertebrate genomes contain many thousands of endogenous retroviruses (ERVs) and their LTRs. Single LTRs are difficult to detect from genomic sequences without recourse to repetitiveness or presence in a proviral structure. Understanding of LTR structure increases understanding of LTR function, and of functional genomics. Here we develop models of orthoretroviral LTRs useful for detection in genomes and for structural analysis. PRINCIPAL FINDINGS: Although mutated, ERV LTRs are more numerous and diverse than exogenous retroviral (XRV) LTRs. Hidden Markov models (HMMs), and alignments based on them, were created for HML- (human MMTV-like), general-beta-, gamma- and lentiretrovirus-like LTRs, plus a general-vertebrate LTR model. Training sets were XRV LTRs and RepBase LTR consensuses. The HML HMM was most sensitive and detected 87% of the HML LTRs in human chromosome 19 at 96% specificity. By combining all HMMs with a low cutoff, for screening, 71% of all LTRs found by RepeatMasker in chromosome 19 were found. HMM consensus sequences had a conserved modular LTR structure. Target site duplications (TG-CA), TATA (occasionally absent), an AATAAA box and a T-rich region were prominent features. Most of the conservation was located in, or adjacent to, R and U5, with evidence for stem loops. Several of the long HML LTRs contained long ORFs inserted after the second A-rich module. HMM consensus alignment allowed comparison of functional features like transcriptional start sites (sense and antisense) between XRVs and ERVs. CONCLUSION: The modular conserved and redundant orthoretroviral LTR structure with three A-rich regions is reminiscent of structurally relaxed Giardia promoters. The five HMMs provided a novel broad-range, repeat-independent, ab initio LTR detection, with prospects for greater generalisation, and insight into LTR structure, which may aid development of LTR-targeted pharmaceuticals.


Subjects
DNA, Viral/genetics , Retroviridae/genetics , Terminal Repeat Sequences , Algorithms , Animals , Base Sequence , DNA, Viral/chemistry , Gene Expression Regulation, Viral , Genome, Human , Genome, Viral , Humans , Mice , Molecular Sequence Data , Nucleic Acid Conformation , Open Reading Frames , Opossums/genetics , Sensitivity and Specificity
17.
BMC Bioinformatics ; 8 Suppl 2: S11, 2007 May 03.
Article in English | MEDLINE | ID: mdl-17493249

ABSTRACT

BACKGROUND: Human endogenous retroviruses (HERVs) are surviving traces of ancient retrovirus infections and now reside within the human DNA. Recently HERV expression has been detected in both normal tissues and diseased patients. However, the activities (expression levels) of individual HERV sequences are mostly unknown. RESULTS: We introduce a generative mixture model, based on Hidden Markov Models, for estimating the activities of the individual HERV sequences from EST (expressed sequence tag) databases. We use the model to estimate the relative activities of 181 HERVs. We also empirically justify a faster heuristic method for HERV activity estimation and use it to estimate the activities of 2450 HERVs. The majority of the HERV activities were previously unknown. CONCLUSION: (i) Our methods estimate activity accurately based on experiments on simulated data. (ii) Our estimate on real data shows that 7% of the HERVs are active. The active ones are spread unevenly into HERV groups and relatively uniformly in terms of estimated age. HERVs with the retroviral env gene are more often active than HERVs without env. Few of the active HERVs have open reading frames for retroviral proteins.


Subjects
Algorithms , Chromosome Mapping/methods , Databases, Genetic , Evolution, Molecular , Expressed Sequence Tags , Genome, Viral/genetics , Retroviridae/genetics , Virus Activation/genetics , Humans , Markov Chains , Retroviridae/classification , Species Specificity
18.
Int J Neural Syst ; 15(3): 163-79, 2005 Jun.
Article in English | MEDLINE | ID: mdl-16013088

ABSTRACT

About 8 per cent of the human genome consists of human endogenous retroviral sequences (HERVs), which are remains from ancient infections. The HERVs may give rise to transcripts or affect the expression of human genes. The first step in understanding HERV function is to classify HERVs into families. In this work we study the relationships of existing HERV families and detect potentially new HERV families. A Median Self-Organizing Map (SOM), a SOM for non-vectorial data, is used to group and visualize a collection of 3661 HERVs. The SOM-based analysis is complemented with estimates of the reliability of the results. A novel trustworthiness visualization method is used to estimate which parts of the SOM visualization are reliable and which are not. The reliability of extracted interesting HERV groups is verified by a bootstrap procedure suitable for SOM visualization-based analysis. The SOM detects a group of epsilonretroviral sequences and a group of ERV9, HERVW, and HUERSP3 sequences, which suggests that ERV9 and HERVW sequences may have a common origin.


Subjects
Artificial Intelligence , Chromosome Mapping/methods , DNA/genetics , Endogenous Retroviruses/genetics , Genome, Human , Algorithms , Humans , Phylogeny , Reproducibility of Results
19.
BMC Bioinformatics ; 4: 48, 2003 Oct 13.
Article in English | MEDLINE | ID: mdl-14552657

ABSTRACT

BACKGROUND: Conventionally, the first step in analyzing the large and high-dimensional data sets measured by microarrays is visual exploration. Dendrograms of hierarchical clustering, self-organizing maps (SOMs), and multidimensional scaling have been used to visualize similarity relationships of data samples. We address two central properties of the methods: (i) Are the visualizations trustworthy, i.e., if two samples are visualized to be similar, are they really similar? (ii) The metric. The measure of similarity determines the result; we propose using a new learning metrics principle to derive a metric from interrelationships among data sets. RESULTS: The trustworthiness of hierarchical clustering, multidimensional scaling, and the self-organizing map were compared in visualizing similarity relationships among gene expression profiles. The self-organizing map was the best except that hierarchical clustering was the most trustworthy for the most similar profiles. Trustworthiness can be further increased by treating separately those genes for which the visualization is least trustworthy. We then proceed to improve the metric. The distance measure between the expression profiles is adjusted to measure differences relevant to functional classes of the genes. The genes for which the new metric is the most different from the usual correlation metric are listed and visualized with one of the visualization methods, the self-organizing map, computed in the new metric. CONCLUSIONS: The conjecture from the methodological results is that the self-organizing map can be recommended to complement the usual hierarchical clustering for visualizing and exploring gene expression data. Discarding the least trustworthy samples and improving the metric improve the visualization further.
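
Editorial illustration (not part of the indexed record): the abstract asks whether samples shown as similar in a visualization are really similar, but it does not state a formula. The sketch below assumes one widely used rank-based formulation of trustworthiness, in which a point that enters the k-nearest neighborhood of another point in the visualization without belonging to its k-nearest neighborhood in the original data is penalized by how far away it originally ranked. The function names, the distance-matrix inputs, and the index-based tie-breaking are assumptions made for this example and may differ from what the authors used.

```python
def rank_matrix(dist):
    """ranks[i][j] = rank of j among the neighbors of i (1 = nearest), i excluded."""
    n = len(dist)
    ranks = [[0] * n for _ in range(n)]
    for i in range(n):
        # Ties in distance are broken by index (an arbitrary choice for this sketch).
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: (dist[i][j], j))
        for r, j in enumerate(order, start=1):
            ranks[i][j] = r
    return ranks


def trustworthiness(dist_orig, dist_vis, k):
    """Rank-based trustworthiness of a visualization for neighborhoods of size k.

    dist_orig and dist_vis are square distance matrices (lists of lists) in the
    original space and in the visualization. Returns 1.0 when no point enters a
    visualized k-neighborhood that it did not belong to originally.
    """
    n = len(dist_orig)
    if not 0 < k < n / 2:
        raise ValueError("this normalization assumes 0 < k < n / 2")
    r_orig = rank_matrix(dist_orig)
    r_vis = rank_matrix(dist_vis)
    penalty = 0
    for i in range(n):
        for j in range(n):
            # j looks like a neighbor of i on the display but was not one originally.
            if j != i and r_vis[i][j] <= k and r_orig[i][j] > k:
                penalty += r_orig[i][j] - k
    return 1 - 2 * penalty / (n * k * (2 * n - 3 * k - 1))
```

Passing the same distance matrix for both arguments yields 1.0; a visualization that places originally distant samples next to each other scores lower.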


Subjects
Computer Graphics/standards , Gene Expression Profiling/standards , Oligonucleotide Array Sequence Analysis/standards , Animals , Cluster Analysis , Computer Graphics/trends , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation/genetics , Gene Expression Regulation, Fungal/genetics , Humans , Mice , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Sequence Homology, Nucleic Acid