Model Fitting, Fitting Probability Distribution, Parametric Learning
Methods find the most likely parameter that explain the data and boil down to
if ,
statistical experiment be a sample … , of i.i.d.
random variables in some measurable space Ω, usually Ω ⊆ ℝ
hyperparameter , is data set
- While performing MLE estimation, we update the weights through back propagation to maximize the likelihood of the data, obtaining the optimal point estimation
- While performing MAP estimation, we update the weights through back propagation to maximize the posterior probability, obtaining the optimal point estimation
- While performing Bayesian inference, we update the weights through back propagation to calculate the posterior probability distribution, obtaining the optimal density estimation
MLE is intuitive, MAP is a generalized MLE with non-constant log-prior, and ERM is a generalized form with any loss function and regularization term.
Point Estimations
Parameter Estimation Notion