Power law that
Dataset exhibiting scale-invariance and self-similarity driven by power-law relationship
a few words occur very often, and many words hardly ever occur
Word frequency is inversely proportional to rank. For example, the most frequent word appears approximately twice as often as the second most frequent word
where is means frequency and indicates the rank in a set of variables controlled by parameter
Practice follows theory quite well, but not entirely.