Not an exact mathematical result, just a practical approximation
The stochasticity trick of Monte Carlo method (Importance sampling)
Assume you have a difficult integral to compute, say an expectation $I = \int f(x)\,p(x)\,dx$. Plain Monte Carlo draws $x_i \sim p$ and averages: $I \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i)$. Importance sampling instead draws $x_i \sim q$ from a proposal distribution and reweights: $I = \int f(x)\,\frac{p(x)}{q(x)}\,q(x)\,dx \approx \frac{1}{N}\sum_{i=1}^{N} \frac{p(x_i)}{q(x_i)}\,f(x_i)$.
The importance-sampling estimator performs better than sampling from the original distribution when it has lower variance. For comparison, the variance of the plain estimator is $\frac{1}{N}\mathrm{Var}_{p}[f(x)]$, while the variance of the importance-sampling estimator is $\frac{1}{N}\mathrm{Var}_{q}\!\left[\frac{p(x)}{q(x)}f(x)\right]$, with lower variance being preferable.
In short, Monte Carlo methods enable us to estimate any integral by random sampling. In Bayesian statistics, the evidence $p(D) = \int p(D \mid \theta)\,p(\theta)\,d\theta$ is also a form of integral, so it becomes tractable to approximate.
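The two estimators above can be compared on a toy problem. A minimal sketch (the target, proposal, and function here are illustrative choices, not from the text):

```python
# Plain Monte Carlo vs. importance sampling for E_p[f(x)]
# with p = N(0, 1) and f(x) = x**2, whose true value is 1.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
f = lambda x: x**2

# Plain Monte Carlo: sample directly from the original distribution p.
x_p = rng.normal(0.0, 1.0, N)
plain_estimate = f(x_p).mean()

# Importance sampling: sample from a wider proposal q = N(0, 2),
# then reweight each sample by the density ratio p(x)/q(x).
sigma_q = 2.0
x_q = rng.normal(0.0, sigma_q, N)
log_p = -0.5 * x_q**2 - 0.5 * np.log(2 * np.pi)
log_q = -0.5 * (x_q / sigma_q)**2 - 0.5 * np.log(2 * np.pi * sigma_q**2)
w = np.exp(log_p - log_q)            # importance weights p(x)/q(x)
is_estimate = (w * f(x_q)).mean()

print(plain_estimate, is_estimate)   # both close to the true value 1.0
```

Both estimators are unbiased; which one has lower variance depends on how well $q$ matches the shape of $f(x)\,p(x)$.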
Self-normalized importance sampling
The weight is the probability under the target divided by the probability under the proposal distribution: $w(x) = \frac{p(x)}{q(x)}$.
Small setback: for the particular case where the distribution we integrate against is the posterior $p(\theta \mid D) = \frac{p(D \mid \theta)\,p(\theta)}{p(D)}$, we can only evaluate it up to a constant, since the evidence $p(D)$ is itself intractable.
To handle this, we define unnormalized weights $\tilde{w}(\theta) = \frac{p(D \mid \theta)\,p(\theta)}{q(\theta)}$, whose average approximates the evidence: $p(D) \approx \frac{1}{N}\sum_{i} \tilde{w}(\theta_i)$.
Use the unnormalized importance weights, with the same samples $\theta_i \sim q$, to estimate both the numerator and the denominator of $\mathbb{E}_{p(\theta \mid D)}[f(\theta)] = \frac{\int f(\theta)\,p(D \mid \theta)\,p(\theta)\,d\theta}{\int p(D \mid \theta)\,p(\theta)\,d\theta}$.
Then we can normalize the unnormalized importance weights into $w_i = \frac{\tilde{w}(\theta_i)}{\sum_j \tilde{w}(\theta_j)}$, giving the self-normalized estimator $\mathbb{E}_{p(\theta \mid D)}[f(\theta)] \approx \sum_i w_i\, f(\theta_i)$.
After that we can represent the posterior itself as a weighted sum of point masses (delta functions): $p(\theta \mid D) \approx \sum_i w_i\, \delta(\theta - \theta_i)$.
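The steps above can be sketched on a conjugate-Gaussian model where the exact posterior is known (the model, proposal, and observed value are illustrative assumptions, not from the text):

```python
# Self-normalized importance sampling for a conjugate-Gaussian posterior.
# Model: theta ~ N(0, 1), y | theta ~ N(theta, 1), observed y = 2.
# Analytic posterior: N(1, 1/2), so the posterior mean is exactly 1.0.
import numpy as np

rng = np.random.default_rng(1)
N = 200_000
y = 2.0

# Proposal q = N(0, 3): easy to sample from and covers the posterior.
sigma_q = 3.0
theta = rng.normal(0.0, sigma_q, N)

# Log of the unnormalized weights w~ = p(y|theta) p(theta) / q(theta).
# Shared additive constants are dropped: self-normalization cancels them.
log_prior = -0.5 * theta**2
log_lik = -0.5 * (y - theta)**2
log_q = -0.5 * (theta / sigma_q)**2 - np.log(sigma_q)
log_w = log_prior + log_lik - log_q
log_w -= log_w.max()                  # stabilize before exponentiating
w_tilde = np.exp(log_w)

# Normalize the weights and estimate a posterior expectation.
w = w_tilde / w_tilde.sum()
posterior_mean = (w * theta).sum()
print(posterior_mean)                 # close to the analytic value 1.0
```

Note that the evidence and all normalizing constants cancel in the ratio, which is exactly why self-normalization sidesteps the intractable $p(D)$.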
Likelihood Weighting
Using the prior as the proposal distribution, $q(\theta) = p(\theta)$, the weights reduce to the likelihood: $\tilde{w}(\theta) = p(D \mid \theta)$. This provides a softer approach than rejection sampling, which makes hard accept/reject decisions when sampling latent variables.
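In code, likelihood weighting is the previous sketch with the proposal swapped for the prior, so the density ratio collapses to the likelihood (same illustrative conjugate-Gaussian model as assumed above):

```python
# Likelihood weighting: sample theta from the prior, weight by p(y | theta).
# Model: theta ~ N(0, 1), y | theta ~ N(theta, 1), observed y = 2;
# analytic posterior mean is 1.0.
import numpy as np

rng = np.random.default_rng(2)
N = 200_000
y = 2.0

# Proposal = prior, so weights are just the likelihood (up to a constant).
theta = rng.normal(0.0, 1.0, N)
w_tilde = np.exp(-0.5 * (y - theta)**2)

# Self-normalize and estimate the posterior mean.
w = w_tilde / w_tilde.sum()
posterior_mean = (w * theta).sum()
print(posterior_mean)                 # close to the analytic value 1.0
```

Every sample contributes, weighted softly by how well it explains the data, instead of being discarded outright as in rejection sampling.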
In importance-sampling practice, a good proposal must be close to the posterior, which may be quite different from the prior. MCMC algorithms address this by drawing samples from the target distribution directly, performing a biased random walk over the space of latent variables $\theta$.
Annealed Importance Sampling (AIS)
Instead of using a fixed proposal distribution, samples are drawn through a process of gradually changing distributions, transitioning from an easy initial distribution (e.g., the prior) to the target through intermediate distributions, with MCMC transitions applied at each step.
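A minimal AIS sketch on the same illustrative conjugate-Gaussian model: intermediate distributions $p(\theta)\,p(y \mid \theta)^{\beta}$ interpolate between prior ($\beta = 0$) and posterior ($\beta = 1$), with a Metropolis step at each temperature. The schedule, step size, and chain counts are arbitrary assumptions for the demo:

```python
# Annealed Importance Sampling between prior N(0,1) and the posterior
# of theta ~ N(0,1), y | theta ~ N(theta,1) with y = 2 (posterior mean 1.0).
import numpy as np

rng = np.random.default_rng(3)
y = 2.0
n_chains, n_steps = 2_000, 50
betas = np.linspace(0.0, 1.0, n_steps + 1)   # annealing schedule

log_prior = lambda t: -0.5 * t**2
log_lik = lambda t: -0.5 * (y - t)**2

# Start every chain at a prior sample (beta = 0).
theta = rng.normal(0.0, 1.0, n_chains)
log_w = np.zeros(n_chains)

for b_prev, b in zip(betas[:-1], betas[1:]):
    # AIS weight increment: ratio of consecutive intermediate densities.
    log_w += (b - b_prev) * log_lik(theta)
    # One Metropolis step targeting p(theta) * p(y|theta)^b.
    prop = theta + rng.normal(0.0, 0.5, n_chains)
    log_acc = (log_prior(prop) + b * log_lik(prop)
               - log_prior(theta) - b * log_lik(theta))
    accept = np.log(rng.uniform(size=n_chains)) < log_acc
    theta = np.where(accept, prop, theta)

# Self-normalized posterior-mean estimate from the annealed weights.
w = np.exp(log_w - log_w.max())
posterior_mean = (w * theta).sum() / w.sum()
print(posterior_mean)                 # close to the analytic value 1.0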
The same trick appears in reinforcement learning: weight each trajectory by the ratio of its probability of occurrence under the target policy to that under the behavior policy. This makes off-policy data mathematically usable, so more data can be exploited while each trajectory is weighted by its importance.
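This off-policy reweighting can be sketched on a toy state-independent two-action problem (the policies, reward, and horizon are hypothetical choices for illustration):

```python
# Off-policy importance weighting: trajectories collected under a behavior
# policy b are reweighted by the per-trajectory ratio prod_t pi(a_t)/b(a_t)
# to estimate the expected return of a target policy pi.
import numpy as np

rng = np.random.default_rng(4)
n_traj, horizon = 50_000, 3

b = np.array([0.5, 0.5])              # behavior policy over 2 actions
pi = np.array([0.2, 0.8])             # target policy we want to evaluate
reward = np.array([0.0, 1.0])         # action 1 pays 1, action 0 pays 0

# Collect trajectories under the behavior policy.
actions = rng.choice(2, size=(n_traj, horizon), p=b)
returns = reward[actions].sum(axis=1)

# Per-trajectory importance ratio: product of per-step policy ratios.
rho = np.prod(pi[actions] / b[actions], axis=1)

# Importance-weighted estimate of the target policy's expected return.
est = (rho * returns).mean()
print(est)                            # true value is 3 * 0.8 = 2.4
```

As with ordinary importance sampling, the estimator is unbiased but its variance grows with the mismatch between the two policies (and with the horizon, since the per-step ratios multiply).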