Topic model

Creator
Seonglae Cho
Created
2025 Mar 12 11:40
Edited
2025 Mar 12 12:47
A topic model uncovers hidden (latent) topical patterns or semantic structure in a collection of documents. Quantitatively, it groups (clusters) words (terms, n-grams) that are somehow related. It is often defined as a probabilistic structure (e.g. a word cluster) expressing a certain set of assumptions about how the documents in the collection were generated.
Topic modeling methods

Pointwise mutual information

Words that occur in similar contexts (co-occur) tend to have similar meanings. PMI measures this association for two words x and y:
PMI(x, y) = log( P(x, y) / (P(x) P(y)) )
  • Numerator P(x, y): how often we have seen these words together
  • Denominator P(x) P(y): how often we expect the words to co-occur, assuming they are independent
  • PMI: how much more often two words co-occur than expected by chance
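
The ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name pmi_table, the toy documents, the choice of counting co-occurrence at the document level, and the base-2 logarithm are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return {(w1, w2): PMI} for word pairs that share at least one document.

    Assumption: two words "co-occur" when they appear in the same document,
    and probabilities are estimated as document frequencies.
    """
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words

    for doc in documents:
        vocab = sorted(set(doc.lower().split()))  # sorted so pair keys are stable
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    scores = {}
    for (w1, w2), joint in pair_df.items():
        p_xy = joint / n_docs          # numerator: observed co-occurrence
        p_x = word_df[w1] / n_docs
        p_y = word_df[w2] / n_docs     # p_x * p_y: expected under independence
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus (hypothetical data for illustration only).
docs = ["cat dog", "cat dog", "cat fish", "bird fish"]
scores = pmi_table(docs)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

A positive score means the pair appears together more often than independence would predict, and a negative score means less often. Pairs that never co-occur are simply absent from the result, since their PMI would be negative infinity.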
