Goal: model the probability that a document is generated by a specific language model
The text of a document is treated as a sample drawn from that language model
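As a minimal sketch of this idea, assume a unigram model estimated by maximum likelihood, so that the probability of a document is the product of its word probabilities. The notes do not fix a model order; the unigram choice, the toy text, and the names `unigram_model` and `log_prob` are illustrative assumptions.

```python
import math
from collections import Counter

def unigram_model(tokens):
    """Maximum-likelihood unigram estimates: P(w) = count(w) / N.
    (Hypothetical helper; unigram independence is an assumption.)"""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def log_prob(doc_tokens, model):
    """Log-probability that the model generates the document,
    i.e. the sum of log P(w) over the document's words."""
    logp = 0.0
    for w in doc_tokens:
        p = model.get(w, 0.0)
        if p == 0.0:
            # A single unseen word makes the whole product zero.
            return float("-inf")
        logp += math.log(p)
    return logp

# Toy training text (illustrative only).
model = unigram_model("the cat sat on the mat".split())
seen = log_prob("the cat".split(), model)    # both words were observed
unseen = log_prob("the dog".split(), model)  # "dog" was never observed
```

Here `seen` is finite, while `unseen` is negative infinity because "dog" has an estimated probability of zero.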
Words missing from the sample should not have zero probability of occurring: a single zero would make the probability of any text containing that word zero. Smoothing is a technique for estimating probabilities for missing (or unseen) words
- Laplace smoothing (Laplace correction)
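Laplace smoothing can be sketched as adding a pseudo-count to every word in the vocabulary, giving P(w) = (count(w) + 1) / (N + |V|). The example below assumes a unigram setting and a known, fixed vocabulary; the function name `laplace_prob`, the `alpha` parameter, and the toy text are illustrative, not taken from the notes.

```python
from collections import Counter

def laplace_prob(word, counts, vocab_size, alpha=1.0):
    """Add-alpha estimate: (count(w) + alpha) / (N + alpha * |V|).
    alpha = 1 corresponds to Laplace (add-one) smoothing."""
    total = sum(counts.values())
    return (counts.get(word, 0) + alpha) / (total + alpha * vocab_size)

# Toy training text (illustrative only).
counts = Counter("the cat sat on the mat".split())

# Assumed vocabulary: the observed words plus one unseen word, "dog".
vocab = set(counts) | {"dog"}

p_the = laplace_prob("the", counts, len(vocab))  # seen twice
p_dog = laplace_prob("dog", counts, len(vocab))  # never seen, yet non-zero

# The smoothed estimates still sum to 1 over the vocabulary.
total_mass = sum(laplace_prob(w, counts, len(vocab)) for w in vocab)
```

With 6 training tokens and a 6-word vocabulary, "the" moves from 2/6 to 3/12 and "dog" moves from 0 to 1/12: probability mass is shifted from seen words to unseen ones.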