Word Discounting

Creator: Seonglae Cho
Created: 2025 Jan 29 12:48
Edited: 2025 Feb 5 11:38

A document's text is treated as a sample drawn from an underlying language model.

Words missing from that sample should not be assigned zero probability of occurring. Smoothing is a technique for estimating probabilities for missing (or unseen) words.
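
To make the zero-probability problem concrete, here is a minimal sketch in LaTeX. The notation is an assumption introduced for illustration: c(w, d) is the count of word w in document d, and |d| is the document length in tokens.

```latex
% Unsmoothed maximum-likelihood estimate of a word's probability in document d
P_{\mathrm{ML}}(w \mid d) = \frac{c(w, d)}{|d|}

% If w never occurs in d, then c(w, d) = 0 and therefore
P_{\mathrm{ML}}(w \mid d) = 0
% so any product of word probabilities that includes w is also zero.
```

Smoothing replaces this estimate with one that stays strictly positive for unseen words.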

Discounting Methods

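Laplace (add-one) smoothing gives too much weight to unseen terms. Two alternatives reduce that weight: Lidstone correction adds a constant smaller than one to every count, and absolute discounting subtracts a fixed amount from each seen count and redistributes it.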
  • Lidstone correction
  • Absolute discounting
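
The two discounting methods above can be sketched as follows. This is an illustrative sketch, not a reference implementation: the function names, the parameters `eps` and `delta`, and the choice to spread the freed mass uniformly over unseen vocabulary words are all assumptions made here. Every token is assumed to belong to `vocab`.

```python
from collections import Counter


def lidstone(tokens, vocab, eps=0.5):
    """Add-eps smoothing; eps=1.0 gives Laplace (add-one) smoothing."""
    counts = Counter(tokens)
    denom = len(tokens) + eps * len(vocab)
    return {w: (counts[w] + eps) / denom for w in vocab}


def absolute_discounting(tokens, vocab, delta=0.5):
    """Subtract delta (0 < delta < 1) from each seen count and share
    the freed probability mass equally among unseen words."""
    counts = Counter(tokens)
    total = len(tokens)
    seen = [w for w in vocab if counts[w] > 0]
    unseen = [w for w in vocab if counts[w] == 0]
    if not unseen:
        # Nothing to redistribute to: fall back to the unsmoothed estimate.
        return {w: counts[w] / total for w in vocab}
    probs = {w: (counts[w] - delta) / total for w in seen}
    freed = delta * len(seen) / total  # mass removed from seen words
    for w in unseen:
        probs[w] = freed / len(unseen)
    return probs
```

With a small `eps`, Lidstone moves less mass to unseen words than Laplace does. Absolute discounting removes the same amount from every seen word, so frequent words lose proportionally less than rare ones.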

Interpolation Methods

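Discounting treats all unseen words equally. Interpolation methods instead give each unseen word a probability that depends on how common it is in a background (collection) model.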
  • Jelinek-Mercer Smoothing
    • Smoothing with Background probabilities
  • Dirichlet Smoothing
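
A minimal sketch of the two interpolation methods, assuming a unigram background model estimated from a whole collection. The function names and the parameters `lam` and `mu` are hypothetical. Note that the Jelinek-Mercer convention varies between sources: here `lam` is the weight on the document model, while some texts place it on the background model instead.

```python
from collections import Counter


def background_model(collection_tokens):
    """Unsmoothed unigram probabilities over the whole collection."""
    counts = Counter(collection_tokens)
    total = len(collection_tokens)
    return {w: c / total for w, c in counts.items()}


def jelinek_mercer(word, doc_tokens, background, lam=0.7):
    """Fixed linear mix of document and background probabilities."""
    p_doc = doc_tokens.count(word) / len(doc_tokens)
    return lam * p_doc + (1 - lam) * background.get(word, 0.0)


def dirichlet(word, doc_tokens, background, mu=2000):
    """Add mu pseudo-counts distributed according to the background model."""
    count = doc_tokens.count(word)
    return (count + mu * background.get(word, 0.0)) / (len(doc_tokens) + mu)
```

In both cases an unseen word that is common in the collection receives more probability than an unseen word that is rare. With Dirichlet smoothing the amount of smoothing shrinks as the document gets longer, whereas Jelinek-Mercer applies the same mix regardless of document length.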
 
 
 
 
