MAP

Creator: Seonglae Cho
Created: 2023 Mar 23 1:42
Edited: 2025 Apr 28 20:54
Maximum A Posteriori

Intuitively, MLE finds the \theta that maximizes the probability of the data, while MAP finds the most probable parameters given the data. Since Bayes' theorem adds a prior term to the likelihood (the denominator p(\mathcal{D}) does not depend on \theta), MAP is considered a generalized form of MLE: with a uniform prior the two coincide.
\hat{\theta}_{MAP} = \argmax_\theta p(\theta | \mathcal{D}) = \argmax_\theta \log p(\theta | \mathcal{D})
  • a priori means ‘from the earlier’
  • a posteriori means ‘from the later’
MAP finds the parameters \hat{\theta}_{MAP} maximizing the posterior distribution: we assume \theta itself has some distribution (the prior) and find the optimal \theta under it.
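As a minimal sketch of the MLE/MAP contrast (the Bernoulli likelihood, Beta prior, and the specific counts below are illustrative assumptions, not from this note), the conjugate Beta–Bernoulli model gives both estimators in closed form:

```python
import numpy as np

# Hypothetical coin-flip data: 7 heads out of 10 tosses (illustrative numbers).
data = np.array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0])
n, heads = len(data), int(data.sum())

# Beta(a, b) prior over theta = P(heads); a = b = 5 encodes a belief near 0.5.
a, b = 5.0, 5.0

# MLE: argmax_theta p(D | theta) -> the sample frequency.
theta_mle = heads / n

# MAP: argmax_theta p(theta | D). With a conjugate Beta prior the posterior
# is Beta(a + heads, b + n - heads), whose mode is the MAP estimate.
theta_map = (a + heads - 1) / (a + b + n - 2)

print(f"MLE: {theta_mle:.3f}")  # 0.700
print(f"MAP: {theta_map:.3f}")  # (5 + 7 - 1) / (5 + 5 + 10 - 2) = 11/18 ≈ 0.611
```

With a uniform prior (a = b = 1) the MAP estimate reduces to the MLE, matching the ‘generalized MLE’ view above.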
We assume a zero-mean Gaussian prior with covariance \Sigma for the parameters \theta.
\mathcal{L}(\theta) = \frac{1}{n}\sum_{i=1}^n \ell(y_i, \theta) + \lambda C(\theta)
with \lambda \ge 0 called the regularization parameter and C a measure of model complexity.
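For the zero-mean Gaussian prior above, C(\theta) = -\log p(\theta) = \frac{1}{2}\theta^\top \Sigma^{-1} \theta + \text{const}, a weighted L2 penalty. A sketch under additional assumptions not stated in this note (a linear model with Gaussian noise of variance \sigma^2 and \Sigma = \tau^2 I), where the MAP estimate has the familiar ridge-regression closed form:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup (not in the note): y = X @ theta + noise, Gaussian noise
# with variance sigma2, and prior theta ~ N(0, tau2 * I).
n, d = 50, 3
theta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
sigma2, tau2 = 0.25, 1.0
y = X @ theta_true + rng.normal(scale=np.sqrt(sigma2), size=n)

# Negative log posterior (up to constants):
#   (1 / (2 sigma2)) ||y - X theta||^2 + (1 / (2 tau2)) ||theta||^2
# Minimizing it gives the ridge closed form with lambda' = sigma2 / tau2.
lam = sigma2 / tau2
theta_map = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# MLE drops the prior term (lambda' = 0): ordinary least squares.
theta_mle = np.linalg.solve(X.T @ X, X.T @ y)

print("MAP:", np.round(theta_map, 3))
print("MLE:", np.round(theta_mle, 3))
```

The Gaussian prior thus plays exactly the role of C(\theta) in the regularized loss, with the prior covariance setting the strength of the penalty.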
If we use the Log-likelihood function as the loss, a common penalty is C(\theta) = -\log p(\theta), where p(\theta) is the prior. Setting \lambda = \frac{1}{n} and dropping the overall \frac{1}{n} factor, which does not change the minimizer:
\mathcal{L}(\theta) = -\sum_{i=1}^n \log p(Y_i | \theta) - \log p(\theta) = -\left(\log p(\mathcal{D} | \theta) + \log p(\theta)\right) = -\log p(\theta | \mathcal{D}) - \log p(\mathcal{D})
When we use the log form of Bayes Theorem, \log p(\mathcal{D}) does not depend on \theta, so minimizing this loss is equivalent to maximizing the log posterior, i.e. to MAP estimation.
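A quick numerical check of this equivalence, reusing the illustrative Beta–Bernoulli numbers from the sketch above: minimizing \mathcal{L}(\theta) recovers the closed-form posterior mode.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Same illustrative Beta-Bernoulli setup as above (assumed, not from the note).
n, heads = 10, 7
a, b = 5.0, 5.0

def neg_log_posterior(theta):
    # L(theta) = -log p(D | theta) - log p(theta), dropping log p(D),
    # which is constant in theta.
    log_lik = heads * np.log(theta) + (n - heads) * np.log(1 - theta)
    log_prior = (a - 1) * np.log(theta) + (b - 1) * np.log(1 - theta)
    return -(log_lik + log_prior)

res = minimize_scalar(neg_log_posterior, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(f"numerical argmin: {res.x:.4f}")                              # ≈ 0.6111
print(f"closed-form mode: {(a + heads - 1) / (a + b + n - 2):.4f}")  # 11/18 ≈ 0.6111
```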