1.
BMC Bioinformatics ; 24(1): 399, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37884874

ABSTRACT

BACKGROUND: We consider two key problems in genomics involving multiple traits: multi-trait genome-wide association studies (GWAS), where the goal is to detect genetic variants associated with the traits; and multi-trait genomic selection (GS), where the emphasis is on accurately predicting trait values. Multi-trait linear mixed models build on the linear mixed model to jointly model multiple traits. Existing estimation methods, however, are limited to the joint analysis of a small number of genotypes; in fact, most approaches consider one SNP at a time. Estimating multi-dimensional genetic and environmental effects also imposes a considerable computational burden. Efficient approaches that incorporate regularization into multi-trait linear models (no random effects) have recently been proposed to identify genomic loci associated with multiple traits (Yu et al. in Multitask learning using task clustering with applications to predictive modeling and GWAS of plant varieties. arXiv:1710.01788, 2017; Yu et al. in Front Big Data 2:27, 2019), but these ignore population structure and familial relatedness (Yu et al. in Nat Genet 38:203-208, 2006). RESULTS: This work addresses this gap by proposing a novel class of regularized multi-trait linear mixed models along with scalable approaches for estimation in the presence of high-dimensional genotypes and a large number of traits. We evaluate the effectiveness of the proposed methods using datasets in maize and sorghum diversity panels, and demonstrate benefits both in achieving high prediction accuracy in GS and in identifying relevant marker-trait associations. CONCLUSIONS: The proposed regularized multivariate linear mixed models are relevant for both GWAS and GS. We hope that they will facilitate agronomy-related research in plant biology and crop breeding endeavors.
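
To make the idea of a regularized multi-trait linear model concrete, the following is a minimal sketch and not the authors' method. It assumes a row-wise group-lasso (L2,1) penalty, which the abstract does not specify, and it omits the random effects for population structure and relatedness that the paper adds. The function names, the proximal-gradient solver, and the toy data are all hypothetical choices made for illustration.

```python
import math

def row_soft_threshold(B, tau):
    """Proximal operator of tau * sum_j ||B[j]||_2.

    Each row holds one SNP's effects on all traits, so shrinking a whole
    row to zero drops that SNP for every trait at once.
    """
    out = []
    for row in B:
        norm = math.sqrt(sum(b * b for b in row))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        out.append([scale * b for b in row])
    return out

def fit_multitrait_group_lasso(X, Y, lam, step, n_iter=500):
    """Proximal gradient for 0.5 * ||Y - X B||_F^2 + lam * sum_j ||B[j]||_2.

    X: n x p genotype matrix, Y: n x t trait matrix (lists of lists).
    The caller must pick step <= 1 / (largest eigenvalue of X^T X).
    """
    n, p, t = len(X), len(X[0]), len(Y[0])
    B = [[0.0] * t for _ in range(p)]
    for _ in range(n_iter):
        # Residual R = X B - Y
        R = [[sum(X[i][j] * B[j][k] for j in range(p)) - Y[i][k]
              for k in range(t)] for i in range(n)]
        # Gradient of the squared-error term: G = X^T R
        G = [[sum(X[i][j] * R[i][k] for i in range(n))
              for k in range(t)] for j in range(p)]
        # Gradient step followed by row-wise shrinkage
        stepped = [[B[j][k] - step * G[j][k] for k in range(t)]
                   for j in range(p)]
        B = row_soft_threshold(stepped, step * lam)
    return B

# Toy data (hypothetical): 2 samples, 2 SNPs, 2 traits, orthonormal design.
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[3.0, 4.0], [0.1, 0.1]]
B = fit_multitrait_group_lasso(X, Y, lam=2.5, step=1.0, n_iter=50)
# SNP 0 keeps a shrunken effect on both traits; SNP 1 is zeroed for both.
```

Because the penalty acts on whole rows, a SNP is either retained for all traits or removed for all traits, which is one way a joint model can share evidence across traits.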


Subjects
Genome-Wide Association Study, Plant Breeding, Genome-Wide Association Study/methods, Linear Models, Phenotype, Genomics/methods, Agricultural Crops, Single Nucleotide Polymorphism, Genetic Models
2.
Nat Commun ; 12(1): 5743, 2021 Sep 30.
Article in English | MEDLINE | ID: mdl-34593817

ABSTRACT

Machine learning has been increasingly used for protein engineering. However, because the general sequence contexts that existing machine learning algorithms capture are not specific to the protein being engineered, their accuracy is rather limited. Here, we report ECNet (evolutionary context-integrated neural network), a deep-learning algorithm that exploits evolutionary contexts to predict functional fitness for protein engineering. This algorithm integrates local evolutionary context from homologous sequences that explicitly model residue-residue epistasis for the protein of interest with the global evolutionary context that encodes rich semantic and structural features from the enormous protein sequence universe. As such, it enables accurate mapping from sequence to function and provides generalization from low-order mutants to higher-order mutants. We show that ECNet predicts the sequence-function relationship more accurately than existing machine learning algorithms by using ~50 deep mutational scanning and random mutagenesis datasets. Moreover, we used ECNet to guide the engineering of TEM-1 β-lactamase and identified variants with improved ampicillin resistance with high success rates.
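
As a small illustration of what residue-residue epistasis means, the sketch below scores a sequence with per-site terms plus pairwise coupling terms and measures how far a double mutant departs from the sum of its two single mutants. This is a toy additive-plus-pairwise model assumed for illustration only; it is not ECNet, and the function names, effect tables, and sequences are hypothetical.

```python
def fitness(seq, site_effects, pair_effects):
    """Score a sequence as per-site terms plus pairwise coupling terms.

    site_effects maps (position, residue) to a value; pair_effects maps
    (pos_i, residue_i, pos_j, residue_j) with pos_i < pos_j to a value.
    Missing entries count as zero.
    """
    total = sum(site_effects.get((i, aa), 0.0) for i, aa in enumerate(seq))
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            total += pair_effects.get((i, seq[i], j, seq[j]), 0.0)
    return total

def mutate(seq, mutations):
    """Return seq with each (position, new_residue) substitution applied."""
    residues = list(seq)
    for pos, aa in mutations:
        residues[pos] = aa
    return "".join(residues)

def epistasis(wild_type, mut_a, mut_b, site_effects, pair_effects):
    """Double-mutant effect minus the sum of the two single-mutant effects."""
    def score(s):
        return fitness(s, site_effects, pair_effects)
    f_wt = score(wild_type)
    d_a = score(mutate(wild_type, [mut_a])) - f_wt
    d_b = score(mutate(wild_type, [mut_b])) - f_wt
    d_ab = score(mutate(wild_type, [mut_a, mut_b])) - f_wt
    return d_ab - (d_a + d_b)

# Toy example (hypothetical values): two mildly harmful substitutions
# that compensate for each other when combined.
site = {(0, "G"): -1.0, (1, "T"): -0.5}
pair = {(0, "G", 1, "T"): 2.0}
e = epistasis("AC", (0, "G"), (1, "T"), site, pair)  # 2.0
```

With no pairwise terms the epistasis is zero, so a purely additive model fitted on single mutants cannot anticipate such interactions in higher-order mutants.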


Subjects
Deep Learning, Molecular Evolution, Protein Engineering/methods, Amino Acid Sequence/genetics, Datasets as Topic, Genetic Fitness, High-Throughput Screening Assays, Mutation, Amino Acid Sequence Homology, beta-Lactam Resistance/genetics, beta-Lactamases/genetics