Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 2 de 2
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
PLoS Comput Biol ; 17(11): e1009449, 2021 11.
Artigo em Inglês | MEDLINE | ID: mdl-34780468

RESUMO

The cost of sequencing the genome is dropping at a much faster rate compared to assembling and finishing the genome. The use of lightly sampled genomes (genome-skims) could be transformative for genomic ecology, and results using k-mers have shown the advantage of this approach in identification and phylogenetic placement of eukaryotic species. Here, we revisit the basic question of estimating genomic parameters such as genome length, coverage, and repeat structure, focusing specifically on estimating the k-mer repeat spectrum. We show using a mix of theoretical and empirical analysis that there are fundamental limitations to estimating the k-mer spectra due to ill-conditioned systems, and that has implications for other genomic parameters. We get around this problem using a novel constrained optimization approach (Spline Linear Programming), where the constraints are learned empirically. On reads simulated at 1X coverage from 66 genomes, our method, REPeat SPECTra Estimation (RESPECT), had 2.2% error in length estimation compared to 27% error previously achieved. In shotgun sequenced read samples with contaminants, RESPECT length estimates had median error 4%, in contrast to other methods that had median error 80%. Together, the results suggest that low-pass genomic sequencing can yield reliable estimates of the length and repeat content of the genome. The RESPECT software will be publicly available at https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_shahab-2Dsarmashghi_RESPECT.git&d=DwIGAw&c=-35OiAkTchMrZOngvJPOeA&r=ZozViWvD1E8PorCkfwYKYQMVKFoEcqLFm4Tg49XnPcA&m=f-xS8GMHKckknkc7Xpp8FJYw_ltUwz5frOw1a5pJ81EpdTOK8xhbYmrN4ZxniM96&s=717o8hLR1JmHFpRPSWG6xdUQTikyUjicjkipjFsKG4w&e=.


Assuntos
Algoritmos , Genoma , Genômica/estatística & dados numéricos , Sequências Repetitivas de Ácido Nucleico , Software , Animais , Biologia Computacional , Simulação por Computador , Bases de Dados Genéticas/estatística & dados numéricos , Humanos , Invertebrados/classificação , Invertebrados/genética , Análise dos Mínimos Quadrados , Modelos Lineares , Mamíferos/classificação , Mamíferos/genética , Modelos Genéticos , Filogenia , Plantas/classificação , Plantas/genética , Vertebrados/classificação , Vertebrados/genética
2.
J Theor Biol ; 376: 82-90, 2015 Jul 07.
Artigo em Inglês | MEDLINE | ID: mdl-25863268

RESUMO

We study an atomic signaling game under stochastic evolutionary dynamics. There are a finite number of players who repeatedly update from a finite number of available languages/signaling strategies. Players imitate the most fit agents with high probability or mutate with low probability. We analyze the long-run distribution of states and show that, for sufficiently small mutation probability, its support is limited to efficient communication systems. We find that this behavior is insensitive to the particular choice of evolutionary dynamic, a property that is due to the game having a potential structure with a potential function corresponding to average fitness. Consequently, the model supports conclusions similar to those found in the literature on language competition. That is, we show that efficient languages eventually predominate the society while reproducing the empirical phenomenon of linguistic drift. The emergence of efficiency in the atomic case can be contrasted with results for non-atomic signaling games that establish the non-negligible possibility of convergence, under replicator dynamics, to states of unbounded efficiency loss.


Assuntos
Teoria dos Jogos , Modelos Biológicos , Transdução de Sinais
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...