ZipfsLaw

Zipf's Law, an empirical observation rather than a theoretical law, states that the frequency of a word is roughly inversely proportional to its rank in the frequency table. For instance, one statistics-gathering exercise found that "the" is the most frequently occurring word, accounting by itself for nearly 7% of all word occurrences (69971 out of slightly over 1 million). The second-place word "of" accounts for slightly over 3.5% of words (36411 occurrences), about half as frequent. It is followed by "and" (28852 occurrences), about 41% as frequent as "the", which is only loosely the one third that the law predicts. Zipf's Law is a PowerLaw.
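As a quick check on the figures quoted above, the following sketch multiplies each count by its rank. Under an exact inverse-rank law the products would all be equal. The variable and function names are illustrative only; the three counts are the ones cited in the paragraph.

```python
# Counts quoted above for the three most frequent words, in rank order.
counts = [("the", 69971), ("of", 36411), ("and", 28852)]

top = counts[0][1]


def zipf_prediction(rank, top_count=top):
    """Count predicted by an exact 1/rank law anchored at the top word."""
    return top_count / rank


rows = []
for rank, (word, count) in enumerate(counts, start=1):
    # rank * count would be constant if the law held exactly.
    rows.append((word, rank, count, rank * count, zipf_prediction(rank)))

for word, rank, count, product, predicted in rows:
    print(f"{word!r}: rank={rank} count={count} "
          f"rank*count={product} predicted={predicted:.0f}")
```

The products come out as 69971, 72822 and 86556: the same order of magnitude but drifting upward, which is why the law is best read as a rough fit.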

Zipf's Law also appears to apply in other situations, such as the value to you of each member accessible via a network, ranked from most to least valuable. If this holds in a given case, then the value of the network to any one member is 1/1 + 1/2 + … + 1/(n-1), or O(log n). Summing over all n members leads to the conclusion that the value of the entire network grows as O(n log n). See [Metcalfe's Law is Wrong] (by Briscoe, Odlyzko and Tilly) for an extended version of this argument.
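The arithmetic behind this argument can be sketched numerically. The function names below are hypothetical, and the sketch assumes every member sees the same 1/k falloff in the value of the other n-1 members.

```python
import math


def value_to_one_member(n):
    """Sum 1/1 + 1/2 + ... + 1/(n-1): the other n-1 members, ranked by value."""
    return sum(1.0 / k for k in range(1, n))


def network_value(n):
    """Total over all n members, assuming each sees the same Zipf-like falloff."""
    return n * value_to_one_member(n)


for n in (10, 1_000, 100_000):
    per_member = value_to_one_member(n)
    # Print n, the per-member sum, ln(n) for comparison, and the network total.
    print(n, round(per_member, 3), round(math.log(n), 3), round(network_value(n)))
```

The per-member sum stays within a roughly constant offset (about 0.58) of ln n, so the total tracks n log n rather than the n^2 of Metcalfe's Law.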

References

[1] Lada A. Adamic. [Zipf, Power-laws, and Pareto - a ranking tutorial].

Links


Do other PowerLaws also give O(n log n) as the value of a network? I can't remember my series summation facts. -- ChrisPurcell
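One way to explore the question numerically: the sketch below assumes the k-th most valuable member is worth 1/k^s for some exponent s. The symbol s and both function names are hypothetical; s = 1 is the Zipf case above. For s > 1 the per-member sum converges to a constant, so the network total grows roughly linearly in n. For s < 1 the sum grows like n^(1-s), giving a total of about n^(2-s). Only s = 1 gives n log n.

```python
def value_to_one_member(n, s):
    """Sum of 1/k**s over the other n-1 members (s is an assumed exponent)."""
    return sum(k ** -s for k in range(1, n))


def network_value(n, s):
    """Total over all n members, assuming each sees the same falloff."""
    return n * value_to_one_member(n, s)


for s in (0.5, 1.0, 2.0):
    # How much the total grows when the network becomes ten times larger.
    ratio = network_value(100_000, s) / network_value(10_000, s)
    print(f"s={s}: tenfold growth in n multiplies value by {ratio:.1f}")
```

A ratio near 10 indicates linear growth; larger ratios indicate faster-than-linear growth.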


Discussion
