LDA

Creator: Seonglae Cho
Created: 2023 Jun 11 8:58
Edited: 2025 Mar 12 12:41

Latent Dirichlet Allocation

Motivation

LDA is an algorithm that improves upon LSA and is suitable for topic modeling.
Let's assume we have a number of topics, each defined as a distribution over words. A document is generated through the following process: first, we choose a distribution over the topics; then, for each word position, we select a topic assignment and choose a word from that corresponding topic.

Method

  1. For each of the $K$ topics, draw a multinomial distribution $\beta_k$ from a Dirichlet distribution with parameter $\eta$, which controls the mean shape and sparsity of $\beta$.
  2. For each of the $D$ documents, draw a multinomial distribution $\theta_j$ from a Dirichlet distribution with parameter $\alpha$, which controls the mean shape and sparsity of $\theta$.
  3. For each word position $D_{ji}$ in a document $D_j$:
    1. Select a latent topic $z_{ji}$ from the multinomial distribution $\theta_j$.
    2. Choose the observation $w_{ji}$ from the multinomial distribution $\beta_{z_{ji}}$.
Each $\beta_k$ has $V$ parameters, where $V$ is the size of the vocabulary across all $D$ documents, while each $\theta_j$ has $K$ parameters.
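A minimal NumPy sketch of this generative process; the topic count `K`, vocabulary size `V`, document count `D`, document lengths, and the values of `alpha` and `eta` are all illustrative assumptions rather than anything fixed by LDA:

```python
import numpy as np

rng = np.random.default_rng(0)

K, V, D = 3, 1000, 5             # topics, vocabulary size, documents (illustrative)
eta = np.full(V, 0.01)           # Dirichlet prior over words, controls sparsity of beta
alpha = np.full(K, 0.1)          # Dirichlet prior over topics, controls sparsity of theta
doc_lengths = [50, 80, 120, 60, 90]

# Step 1: draw a word distribution beta_k for each topic
beta = rng.dirichlet(eta, size=K)            # shape (K, V)

docs = []
for j in range(D):
    # Step 2: draw a topic distribution theta_j for each document
    theta = rng.dirichlet(alpha)             # shape (K,)
    words = []
    for i in range(doc_lengths[j]):
        # Step 3a: select a latent topic z_ji from theta_j
        z = rng.choice(K, p=theta)
        # Step 3b: choose the observed word w_ji from beta_{z_ji}
        w = rng.choice(V, p=beta[z])
        words.append(w)
    docs.append(words)
```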

Modeling

$$p(\mathbf{W}, \boldsymbol{\Theta}, \mathbf{B}, \mathbf{Z} \mid \boldsymbol{\alpha}, \boldsymbol{\eta}) = \prod_{k=1}^{K} p(\boldsymbol{\beta}_k \mid \boldsymbol{\eta}) \prod_{j=1}^{D} p(\boldsymbol{\theta}_j \mid \boldsymbol{\alpha}) \left( \prod_{i=1}^{N_j} p(z_{ji} \mid \boldsymbol{\theta}_j)\, p(w_{ji} \mid \mathbf{B}, z_{ji}) \right)$$
The posterior is intractable to compute exactly, because the denominator marginalizes over all possible topic assignments $\mathbf{Z}$ as well as $\boldsymbol{\Theta}$ and $\mathbf{B}$, so we approximate it:
$$p(\boldsymbol{\Theta}, \mathbf{B}, \mathbf{Z} \mid \mathbf{W}, \boldsymbol{\alpha}, \boldsymbol{\eta}) = \frac{p(\boldsymbol{\Theta}, \mathbf{B}, \mathbf{Z}, \mathbf{W} \mid \boldsymbol{\alpha}, \boldsymbol{\eta})}{\int_{\mathbf{B}} \int_{\boldsymbol{\Theta}} \sum_{\mathbf{Z}} p(\boldsymbol{\Theta}, \mathbf{B}, \mathbf{Z}, \mathbf{W} \mid \boldsymbol{\alpha}, \boldsymbol{\eta})}$$

Approximation using Gibbs sampling

  1. Initialize the topic assignments randomly or uniformly
  2. In each step, replace the value of one of the variables by a value drawn from the distribution of that variable conditioned on the values of the remaining variables
  3. Repeat until convergence
Estimate the probability of assigning $w_{ji}$ to each topic, conditioned on the topic assignments $\mathbf{z}_{j,-i}$ of all other words $\mathbf{w}_{j,-i}$ (the subscript $-i$ denotes the exclusion of word position $i$):
$$p(z_{ji} = k \mid \mathbf{z}_{j,-i}, \mathbf{w}, \boldsymbol{\alpha}, \boldsymbol{\eta}) \propto \underbrace{\frac{n_{j,k,-i} + \alpha_k}{\sum_{k'=1}^{K} \left( n_{j,k',-i} + \alpha_{k'} \right)}}_{\text{Probability that document } j \text{ chooses topic } k,\; P(k \mid d_j)} \cdot \underbrace{\frac{m_{k,w_{ji},-i} + \eta_{w_{ji}}}{\sum_{\nu=1}^{V} \left( m_{k,\nu,-i} + \eta_{\nu} \right)}}_{\text{Probability that topic } k \text{ generates word } w_{ji},\; P(w_{ji} \mid k)}$$
From the above conditional distribution, sample a topic and set it as the new topic assignment $z_{ji}$ of word $w_{ji}$.
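A compact sketch of the collapsed Gibbs sampler implied by this update. It assumes `docs` is a list of documents, each a list of integer word ids, and uses symmetric scalar `alpha` and `eta` for simplicity; the function name and defaults are illustrative:

```python
import numpy as np

def gibbs_lda(docs, K, V, alpha=0.1, eta=0.01, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    D = len(docs)
    n = np.zeros((D, K))          # n[j, k]: words in document j assigned to topic k
    m = np.zeros((K, V))          # m[k, v]: times word v is assigned to topic k
    m_sum = np.zeros(K)           # total words assigned to each topic
    z = []                        # current topic assignment of every word position

    # random initialization of topic assignments
    for j, doc in enumerate(docs):
        z_j = rng.integers(K, size=len(doc))
        z.append(z_j)
        for w, k in zip(doc, z_j):
            n[j, k] += 1
            m[k, w] += 1
            m_sum[k] += 1

    for _ in range(iters):
        for j, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k_old = z[j][i]
                # remove word i from the counts (the "-i" in the update)
                n[j, k_old] -= 1
                m[k_old, w] -= 1
                m_sum[k_old] -= 1
                # p(z_ji = k | ...) ∝ P(k | d_j) * P(w_ji | k);
                # the document-side denominator is constant in k and cancels on normalization
                p = (n[j] + alpha) * (m[:, w] + eta) / (m_sum + V * eta)
                k_new = rng.choice(K, p=p / p.sum())
                # add the word back with its new assignment
                z[j][i] = k_new
                n[j, k_new] += 1
                m[k_new, w] += 1
                m_sum[k_new] += 1
    return z, n, m
```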

Comparison


Online example
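For a quick hands-on run, a minimal sketch using scikit-learn's `LatentDirichletAllocation`, which fits LDA with variational Bayes (`learning_method="online"` selects the online variant); the toy corpus below is made up purely for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell as markets reacted to rates",
    "investors bought bonds and stocks",
]

X = CountVectorizer().fit_transform(corpus)        # document-term count matrix
lda = LatentDirichletAllocation(n_components=2, learning_method="online", random_state=0)
theta = lda.fit_transform(X)                       # per-document topic proportions
print(theta.round(2))                              # each row sums to 1 across the 2 topics
```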

 
 

Recommendations