Phys Rev E Stat Nonlin Soft Matter Phys; 70(4 Pt 1): 042901, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15600443

ABSTRACT

Zipf's law asserts that in all natural languages the frequency of a word is inversely proportional to its rank. The significance, if any, of this result for language remains a mystery. Here we examine a null hypothesis for the distribution of word frequencies, a so-called discourse-triggered word choice model, which is based on the assumption that the more a word is used, the more likely it is to be used again. We argue that this model is equivalent to the neutral infinite-alleles model of population genetics and so the degeneracy of the different words composing a sample of text is given by the celebrated Ewens sampling formula [Theor. Pop. Biol. 3, 87 (1972)], which we show to produce an exponential distribution of word frequencies.
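
The reuse mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It uses Hoppe's urn, a standard construction whose samples follow the Ewens sampling formula: after i tokens have been written, the next token is a brand-new word with probability theta / (theta + i), and otherwise it copies a uniformly chosen earlier token, so each existing word is reused in proportion to how often it has already appeared. The parameter name theta and the function names simulate_word_choice and frequency_spectrum are assumptions introduced here for illustration.

```python
import random
from collections import Counter


def simulate_word_choice(n_tokens, theta, seed=None):
    """Generate a token sequence with Hoppe's urn (illustrative sketch).

    theta (assumed > 0) is a hypothetical innovation parameter: after i
    tokens, the next token is a new word with probability theta / (theta + i).
    Otherwise an earlier token is copied uniformly at random, which reuses
    each existing word in proportion to its current count.
    """
    rng = random.Random(seed)
    tokens = []
    next_word_id = 0
    for i in range(n_tokens):
        if rng.random() < theta / (theta + i):
            # Innovation: introduce a word not seen before.
            tokens.append(next_word_id)
            next_word_id += 1
        else:
            # Reuse: copying a past token favours frequent words.
            tokens.append(rng.choice(tokens))
    return tokens


def frequency_spectrum(tokens):
    """Map each occurrence count k to the number of distinct words used k times."""
    word_counts = Counter(tokens)
    return Counter(word_counts.values())


if __name__ == "__main__":
    sample = simulate_word_choice(n_tokens=10_000, theta=20.0, seed=42)
    spectrum = frequency_spectrum(sample)
    for k in sorted(spectrum)[:10]:
        print(k, spectrum[k])
```

Running the script prints, for the ten smallest occurrence counts k, how many distinct words appear exactly k times. That frequency spectrum is the quantity whose joint distribution the Ewens sampling formula describes, so it can be compared against the distribution of word frequencies discussed in the abstract.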


Subjects
Algorithms, Artificial Intelligence, Statistical Models, Natural Language Processing, Semantics, Terminology as Topic, Controlled Vocabulary, Computer Simulation, Statistical Distributions