GMM Clustering

Creator
Seonglae Cho
Created
2024 Oct 17 10:31
Edited
2025 Mar 25 10:59
Refs
GMM

We estimate the mean vectors and covariance matrices of the component Gaussian distributions

K-means Clustering
K-means is recovered as a limiting case when all components share the same covariance and that variance approaches 0, so the soft responsibilities collapse into hard assignments
Mixture models can be used both to build complex distributions and to cluster data
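A small numpy sketch of this limit (the function name and the example values are illustrative, not from the source): with equal weights and a shared spherical covariance $\epsilon I$, the responsibilities are a softmax of the negative squared distances, which hardens into a nearest-mean (K-means) assignment as $\epsilon \to 0$:

```python
import numpy as np

def responsibilities(x, mus, eps):
    # Equal mixing weights and shared spherical covariance eps*I:
    # responsibilities reduce to a softmax of -||x - mu||^2 / (2*eps)
    d2 = np.array([np.sum((x - m) ** 2) for m in mus])
    w = np.exp(-(d2 - d2.min()) / (2 * eps))  # shift by min for numerical stability
    return w / w.sum()

mus = [np.array([0.0]), np.array([4.0])]
x = np.array([1.0])  # closer to the first mean

r_soft = responsibilities(x, mus, 4.0)   # large variance: genuinely soft split
r_hard = responsibilities(x, mus, 0.05)  # tiny variance: nearly one-hot, like K-means
```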

Solution

We can get $\mu, \Sigma, \gamma$ by differentiating the
GMM
’s log likelihood, but the resulting equations are coupled through the responsibilities, so there is no closed-form solution and we need an iterative algorithm such as the
EM Algorithm
to get a solution.
  • E: fix the distributions’ parameters and compute responsibilities (the probability of each point belonging to each cluster) using the current parameter values
  • M: fix the responsibilities and re-estimate the parameters using them
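The two steps correspond to standard closed-form updates; writing $\gamma_{nk}$ for the responsibility of component $k$ for point $x_n$:

$$\text{E step:}\quad \gamma_{nk} = \frac{\pi_k\,\mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x_n \mid \mu_j, \Sigma_j)}$$

$$\text{M step:}\quad N_k = \sum_{n=1}^{N}\gamma_{nk},\quad \pi_k = \frac{N_k}{N},\quad \mu_k = \frac{1}{N_k}\sum_{n=1}^{N}\gamma_{nk}\,x_n,\quad \Sigma_k = \frac{1}{N_k}\sum_{n=1}^{N}\gamma_{nk}\,(x_n-\mu_k)(x_n-\mu_k)^\top$$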
Detail
  1. Initialize the distributions’ parameters and evaluate the initial value of the log likelihood
  2. For each data point, compute responsibilities using the current parameter values
  3. Re-estimate the parameters using the current responsibilities
  4. Evaluate the log likelihood and check for convergence of either the parameters or the log likelihood. If the convergence criterion is not satisfied, return to step 2
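The four steps above can be sketched in numpy as follows (a minimal illustration, not a production implementation; the function names and the small covariance regularization term are my own choices):

```python
import numpy as np

def gauss_pdf(X, mu, sigma):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mu
    inv = np.linalg.inv(sigma)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return np.exp(-0.5 * np.einsum('ij,jk,ik->i', diff, inv, diff)) / norm

def em_gmm(X, k, n_iter=100, tol=1e-6, mu_init=None, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Step 1: initialize parameters (random data points as means) and the log likelihood
    mu = X[rng.choice(n, k, replace=False)] if mu_init is None else np.array(mu_init, float)
    sigma = np.stack([np.eye(d)] * k)
    pi = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # Step 2 (E): responsibilities gamma[i, j] = P(cluster j | x_i)
        dens = np.stack([pi[j] * gauss_pdf(X, mu[j], sigma[j]) for j in range(k)], axis=1)
        total = dens.sum(axis=1, keepdims=True)
        gamma = dens / total
        # Step 3 (M): re-estimate pi, mu, sigma from the current responsibilities
        Nk = gamma.sum(axis=0)
        pi = Nk / n
        mu = (gamma.T @ X) / Nk[:, None]
        for j in range(k):
            diff = X - mu[j]
            # Small diagonal term keeps the covariance invertible
            sigma[j] = (gamma[:, j, None] * diff).T @ diff / Nk[j] + 1e-6 * np.eye(d)
        # Step 4: evaluate the log likelihood and check convergence
        ll = np.log(total).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, sigma, gamma
```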

Pros

  • Soft-assigns points to clusters (each point receives a membership probability for every cluster)
  • Convergence is guaranteed, but it may reach only a locally optimal solution
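Both properties can be seen with scikit-learn’s GaussianMixture (assuming scikit-learn is available; this is not the note’s own code): predict_proba returns the soft assignments, and n_init restarts EM from several initializations to mitigate local optima:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1.0, (200, 2)),
               rng.normal(4, 1.0, (200, 2))])

# n_init=5 runs EM from 5 different initializations and keeps the best local optimum
gm = GaussianMixture(n_components=2, covariance_type="full",
                     n_init=5, random_state=0).fit(X)
probs = gm.predict_proba(X)  # soft assignments: each row sums to 1
```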