Results 1 - 4 of 4
1.
PLoS One ; 10(3): e0120729, 2015.
Article in English | MEDLINE | ID: mdl-25826299

ABSTRACT

A variety of methods that predict human nonsynonymous single nucleotide polymorphisms (SNPs) to be neutral or disease-associated have been developed over the last decade. These methods are used for pinpointing disease-associated variants among the many variants obtained with next-generation sequencing technologies. The high performance of current sequence-based predictors indicates that sequence data contain valuable information about whether a variant is neutral or disease-associated. However, most predictors do not readily disclose this information, and so it remains unclear which sequence properties are most important. Here, we show how we can obtain insight into sequence characteristics of variants and their surroundings by interpreting predictors. We used an extensive range of features derived from the variant itself, its surrounding sequence, sequence conservation, and sequence annotation, and employed linear support vector machine classifiers to enable extracting feature importance from trained predictors. Our approach is useful for providing additional information about which features are most important for the predictions made. Furthermore, for large sets of known variants, it can provide insight into the mechanisms responsible for variants being disease-associated.


Subjects
Genetic Diseases, Inborn/genetics , Genetic Variation , Amino Acid Sequence , Humans , Molecular Sequence Data , Polymorphism, Single Nucleotide , Sequence Homology, Amino Acid
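
The abstract of record 1 describes reading feature importance off trained linear support vector machine classifiers. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes a Pegasos-style stochastic subgradient trainer written in plain Python (a swapped-in training routine), a classifier without a bias term, and two hypothetical toy features, `conservation_score` and `flanking_noise`, with made-up values. In a linear model, the absolute value of each weight can serve as a rough importance score when features are on comparable scales.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Train a linear SVM (no bias term) with Pegasos-style subgradient steps.

    X is a list of feature vectors and y a list of labels in {-1, +1}.
    This is an illustrative trainer, not the method used in the paper.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Regularisation step: shrink all weights.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step: only samples inside the margin push the weights.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def rank_features(weights, names):
    """Return (name, weight) pairs sorted by absolute weight, largest first."""
    return sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)


# Hypothetical toy data: the first feature separates the classes,
# the second is uncorrelated with the label.
FEATURE_NAMES = ["conservation_score", "flanking_noise"]
X = [[1.0, 0.2], [1.0, -0.2], [-1.0, 0.2], [-1.0, -0.2]]
y = [1, 1, -1, -1]  # +1 = disease-associated, -1 = neutral (toy labels)

weights = train_linear_svm(X, y)
ranking = rank_features(weights, FEATURE_NAMES)
for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

The sign of a weight indicates which class a feature pushes towards, and its magnitude indicates how strongly. This readability is the reason the abstract gives for choosing linear classifiers.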
2.
Protein Eng Des Sel ; 27(9): 281-8, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25082898

ABSTRACT

Protein redesign methods aim to improve a desired property by carefully selecting mutations in relevant regions guided by protein structure. However, the protein structural requirements underlying biological characteristics are often not well understood. Here, we introduce a methodology that learns relevant mutations from a set of proteins that have the desired property and demonstrate it by successfully improving production levels of two enzymes by Aspergillus niger, a relevant host organism for industrial enzyme production. We validated our method on two enzymes, an esterase and an inulinase, creating four redesigns with 5-45 mutations. Up to 10-fold increase in production was obtained with preserved enzyme activity for small numbers of mutations, whereas production levels and activities dropped for overly aggressive redesigns. Our results demonstrate the feasibility of protein redesign by learning. Such an approach has great potential for improving production levels of many industrial enzymes and could potentially be employed for other design goals.


Subjects
Aspergillus niger/enzymology , Directed Molecular Evolution/methods , Esterases/chemical synthesis , Fungal Proteins/chemical synthesis , Glycoside Hydrolases/chemical synthesis , Amino Acid Sequence/genetics , Aspergillus niger/genetics , Cloning, Molecular/methods , Esterases/genetics , Fungal Proteins/genetics , Glycoside Hydrolases/genetics , Protein Structure, Secondary
3.
BMC Bioinformatics ; 15: 93, 2014 Mar 31.
Article in English | MEDLINE | ID: mdl-24685258

ABSTRACT

BACKGROUND: Amino acid sequences and features extracted from such sequences have been used to predict many protein properties, such as subcellular localization or solubility, using classifier algorithms. Although software tools are available for both feature extraction and classifier construction, their application is not straightforward, requiring users to install various packages and to convert data into different formats. This lack of easily accessible software hampers quick, explorative use of sequence-based classification techniques by biologists. RESULTS: We have developed the web-based software tool SPiCE for exploring sequence-based features of proteins in predefined classes. It offers data upload/download, sequence-based feature calculation, data visualization and protein classifier construction and testing in a single integrated, interactive environment. To illustrate its use, two example datasets are included showing the identification of differences in amino acid composition between proteins yielding low and high production levels in fungi and low and high expression levels in yeast, respectively. CONCLUSIONS: SPiCE is an easy-to-use online tool for extracting and exploring sequence-based features of sets of proteins, allowing non-experts to apply advanced classification techniques. The tool is available at http://helix.ewi.tudelft.nl/spice.


Subjects
Proteins/chemistry , Sequence Analysis, Protein/methods , Software Design , Algorithms , Amino Acid Sequence , Aspergillus niger , Internet , Molecular Sequence Data , Saccharomyces cerevisiae
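
The abstract of record 3 mentions sequence-based feature calculation and differences in amino acid composition between protein classes. As a minimal sketch of that kind of feature, and not SPiCE's own code, the example below computes the fraction of each of the 20 standard amino acids in a sequence and the difference in mean composition between two classes. The function names and the toy sequences are hypothetical and carry no biological meaning.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Non-standard symbols (for example 'X') are ignored in the total.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {a: counts[a] / total for a in AMINO_ACIDS}


def mean_composition(sequences):
    """Average the per-sequence compositions over a set of sequences."""
    comps = [aa_composition(s) for s in sequences]
    return {a: sum(c[a] for c in comps) / len(comps) for a in AMINO_ACIDS}


def composition_difference(class_a, class_b):
    """Mean composition of class_a minus mean composition of class_b."""
    mean_a = mean_composition(class_a)
    mean_b = mean_composition(class_b)
    return {a: mean_a[a] - mean_b[a] for a in AMINO_ACIDS}


# Hypothetical toy classes, for illustration only.
high_class = ["MYNYK", "NYYA"]
low_class = ["MKKA", "MKLA"]

diff = composition_difference(high_class, low_class)
for residue in sorted(diff, key=lambda a: abs(diff[a]), reverse=True)[:3]:
    print(f"{residue}: {diff[residue]:+.3f}")
```

The resulting 20-dimensional composition vector is one of the simplest sequence-based features and can be passed directly to a classifier.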
4.
PLoS One ; 7(10): e45869, 2012.
Article in English | MEDLINE | ID: mdl-23049690

ABSTRACT

Protein sequence features are explored in relation to the production of over-expressed extracellular proteins by fungi. Knowledge of features influencing protein production and secretion could be employed to improve enzyme production levels in industrial bioprocesses via protein engineering. A large set of fungal genes, comprising over 600 homologous and nearly 2,000 heterologous genes, was overexpressed in Aspergillus niger using a standardized expression cassette and scored for high versus no production. Subsequently, sequence-based machine learning techniques were applied for identifying relevant DNA and protein sequence features. The amino-acid composition of the protein sequence was found to be most predictive and interpretation revealed that, for both homologous and heterologous gene expression, the same features are important: tyrosine and asparagine composition was found to have a positive correlation with high-level production, whereas for unsuccessful production, contributions were found for methionine and lysine composition. The predictor is available online at http://bioinformatics.tudelft.nl/hipsec. Subsequent work aims at validating these findings by protein engineering as a method for increasing expression levels per gene copy.


Subjects
Aspergillus niger/genetics , Computational Biology/methods , Enzymes/biosynthesis , Fungal Proteins/genetics , Genes, Fungal/genetics , Industrial Microbiology/methods , Amino Acid Sequence , Artificial Intelligence , Aspergillus niger/enzymology , Electrophoresis, Polyacrylamide Gel , Gene Expression Profiling , Genetic Engineering/methods , Molecular Sequence Data
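
The abstract of record 4 reports that the composition of certain residues correlates with high-level production. The sketch below shows one simple way such a relationship could be quantified: a Pearson correlation between a residue's fraction in each sequence and a binary production label. This is a swapped-in, simplified statistic, not the machine learning pipeline the abstract refers to, and the sequences and labels are hypothetical toy values that do not reproduce any reported result.

```python
import math


def residue_fraction(sequence, residue):
    """Fraction of positions in the sequence occupied by the given residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return seq.count(residue.upper()) / len(seq)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: 1 = high production, 0 = no production.
toy_sequences = ["MYYNA", "MYNNG", "MKKLA", "MKALG"]
toy_labels = [1, 1, 0, 0]

r_tyr = pearson([residue_fraction(s, "Y") for s in toy_sequences], toy_labels)
r_lys = pearson([residue_fraction(s, "K") for s in toy_sequences], toy_labels)
print(f"Y vs. label: {r_tyr:+.2f}")
print(f"K vs. label: {r_lys:+.2f}")
```

With a binary label, the Pearson coefficient is the point-biserial correlation. A positive value means the residue is more frequent in the class labelled 1.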
...