Lexian Distributions
- Describes how word frequencies are distributed within a language or text.
- Common forms include Zipf’s law and the more general power law, both of which relate a word’s frequency to its rank.
- Useful for tasks such as language learning (focusing on high-frequency words) and text analysis (identifying important words and comparing texts).
Definition
Lexian distributions are statistical descriptions of how word frequencies are distributed in a given language or text.
Explanation
Lexian distributions characterize how often words occur and how that frequency relates to each word’s rank in a frequency table. By modeling the relationship between rank and frequency, these distributions help quantify which words are most common and how rapidly frequency declines as rank increases.
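The frequency table described above can be built in a few lines. The following is a minimal sketch, not a definitive implementation: the function name `rank_frequency`, the sample sentence, and the deliberately simple tokenizer (lowercase runs of letters and apostrophes) are all assumptions made for illustration.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count, relative_frequency) rows, most frequent first."""
    # Assumed tokenizer: lowercase the text and keep runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    # most_common() sorts by count, descending; tied words keep first-seen order.
    return [
        (rank, word, count, count / total)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical sample text, used only to show the shape of the table.
sample = "the cat and the dog and the bird"
for row in rank_frequency(sample):
    print(row)
```

Each row pairs a rank with a word's count and its share of all tokens, which is the data a rank-frequency model is fitted to. A real corpus would need a more careful tokenizer than the one assumed here.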
Examples
Zipf’s law
Zipf’s law states that a word’s frequency is approximately inversely proportional to its rank in the frequency table. In its idealized form, the most frequent word occurs about twice as often as the second most frequent word, about three times as often as the third most frequent word, and so on. For example, in English the word “the” is the most frequent word, accounting for approximately 7% of word occurrences, while the second most frequent word, “of,” accounts for approximately 3.5%.
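As a rough illustration of the inverse-rank rule, the sketch below computes the frequencies an idealized Zipf distribution would predict from the frequency of the top-ranked word. The function name `zipf_expected` is hypothetical, and the 7% starting value is the approximate figure for “the” quoted above; real corpora only follow the rule approximately.

```python
def zipf_expected(top_frequency, max_rank):
    """Expected relative frequency at ranks 1..max_rank, assuming f(r) = f(1) / r."""
    return [top_frequency / rank for rank in range(1, max_rank + 1)]


# Start from the approximate 7% share of the most frequent word.
for rank, freq in enumerate(zipf_expected(0.07, 5), start=1):
    print(f"rank {rank}: {freq:.2%}")
```

Under these assumptions the second rank comes out at 3.5%, matching the figure given for “of”.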
Power law
A power law describes word frequency as proportional to the word’s rank raised to a negative power, so frequency falls as rank increases. This decline is polynomial in rank rather than exponential: it is slower than an exponential decay and leaves a long tail of rare words. Zipf’s law is the special case in which the exponent is approximately 1. For instance, in English the frequency of “the” is approximately 7%, “of” approximately 3.5%, and “and” approximately 2.5%.
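One common way to estimate the exponent of a rank-frequency power law is a straight-line fit on log-log axes, since f(r) proportional to r^(-s) implies log f = c - s log r. The sketch below assumes that approach and uses ordinary least squares; the function name `fit_power_law_exponent` is hypothetical, and a three-point fit is far too small to be statistically meaningful, so it is shown only to make the idea concrete.

```python
import math


def fit_power_law_exponent(frequencies):
    """Estimate s in f(r) ~ r**(-s) by least squares on log f versus log r.

    `frequencies` must be positive values ordered by rank (rank 1 first),
    with at least two entries.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is -s, so negate it to report a positive exponent.
    return -sxy / sxx


# Approximate English frequencies for "the", "of", and "and" quoted above.
print(fit_power_law_exponent([0.07, 0.035, 0.025]))
```

For the three quoted frequencies the estimate lands a little below 1, which is consistent with reading Zipf’s law as a power law with an exponent near 1.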
Use cases
- Language learning: focusing study on the most frequent words to prioritize communicative utility.
- Text analysis: identifying the most important words in a text and comparing similarity between texts.
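The language-learning use case can be made concrete by asking how many top-ranked words are needed to cover a given share of a text. The sketch below is one possible way to compute that; the function name `words_for_coverage`, the whitespace tokenization, and the sample sentence are assumptions for illustration only.

```python
from collections import Counter


def words_for_coverage(tokens, target):
    """Return the shortest list of top-ranked words whose combined
    share of all tokens reaches `target` (a fraction between 0 and 1)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    selected = []
    covered = 0
    # Walk down the frequency ranking until the target share is reached.
    for word, count in counts.most_common():
        if covered >= target * total:
            break
        selected.append(word)
        covered += count
    return selected


# Hypothetical sample: "the" and "and" together make up 5 of 8 tokens.
tokens = "the cat and the dog and the bird".split()
print(words_for_coverage(tokens, 0.6))  # ['the', 'and']
```

Because frequency falls off quickly with rank, a short list of high-frequency words typically covers a large share of running text, which is the rationale for studying those words first.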
Related terms
- Zipf’s law
- Power law