PPL
Validation perplexity is an important metric for language modeling and serves as a proxy for upstream quality. However, it does not guarantee downstream task performance, which must be evaluated separately. In other words, a low PPL means the model assigns high probability to the data, but not necessarily that it is a good language model.
Perplexity is defined as the exponentiated average negative log-likelihood of a sequence. If we have a tokenized sequence X = (x_0, x_1, ..., x_t), then PPL(X) = exp(-(1/t) * Σ_i log p_θ(x_i | x_<i)).
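The definition above can be sketched directly in code. This is a minimal illustration, not tied to any particular model: the per-token log-probabilities are hypothetical stand-ins for what a causal LM would output.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-likelihood.

    token_logprobs: list of log p(x_i | x_<i) for each token in the
    sequence, e.g. gathered from a causal LM's next-token distribution.
    """
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# Hypothetical per-token probabilities for a 4-token sequence.
logprobs = [math.log(p) for p in (0.5, 0.25, 0.5, 0.25)]
print(perplexity(logprobs))  # ≈ 2.83, the geometric mean of 1/p per token
```

Note that perplexity is the geometric mean of the inverse per-token probabilities, so it can be read as the model's average branching factor: a PPL of ~2.83 means the model is, on average, as uncertain as if it were choosing uniformly among ~2.83 tokens at each step.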
Computing perplexity over a predetermined input text is the more common use. When probing a model's context-length limit, i.e., how much tolerance it has for long contexts, perplexity is instead measured during generation.
Property
- When perplexity is high, the model tends to produce flat attention scores rather than focusing on specific tokens; with low perplexity, it shows sharp attention patterns concentrated on the relevant tokens.
Perplexity Notion
Bullshit Receptivity Scale (BSR), CBRS (Corporate Bullshit Receptivity Scale)
A lack of critical (reflective) thinking, or low Intellect combined with excessive Openness, makes one vulnerable to bullshit and fake news. Be wary of pseudo-profound phrases, such as self-development aphorisms full of abstract concepts. Prefer straightforward expression.
BSR Test