VAEs
Sample using the mean and standard deviation to compute the latent sample
VAEs are regularized autoencoders where the form of the regularizer is defined by the prior (via the ELBO). Put simply, it is a regularizer that forces the approximate posterior to be expressed with a simple distribution, such as the normal distribution used in variational inference.
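A minimal sketch of that regularizer, assuming a diagonal Gaussian encoder $q(z|x)=\mathcal{N}(\mu,\sigma^2)$ and a standard normal prior $p(z)=\mathcal{N}(0,I)$; the function name and the `mu`/`logvar` parameterization are illustrative:

```python
import torch

def kl_regularizer(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over the batch.

    This is the ELBO's regularization term: it pulls the approximate posterior
    toward the simple normal prior used in variational inference.
    """
    kl_per_dim = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar)
    return kl_per_dim.sum(dim=1).mean()
```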
Marginalization
Reparameterization trick
Instead of sampling directly with the parameters θ, use the mean and variance as deterministic weights
Similarly to variational inference, when a deterministic node appears before a stochastic node in the computational graph, we use the reparameterization trick to move the stochastic node to a leaf position, enabling backpropagation.
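A minimal sketch of the trick under a diagonal Gaussian posterior: the stochastic node `eps` becomes a parameter-free leaf, so gradients flow to the encoder through `mu` and `std` (names are illustrative):

```python
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Draw z ~ N(mu, sigma^2) as a deterministic function of (mu, logvar) and noise.

    eps is the only stochastic node and it sits at a leaf of the graph,
    so backpropagation reaches the encoder parameters through mu and std.
    """
    std = (0.5 * logvar).exp()
    eps = torch.randn_like(std)   # stochastic leaf node; no gradient flows into it
    return mu + std * eps
```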
Mini batch
A standard value for this hyperparameter (the mini-batch size) is 100 or larger
with reparameterization on a normal distribution: $z_i = \mu_i + \sigma_i \epsilon_i$, $\epsilon_i \sim \mathcal{N}(0, 1)$ ($i$ is each dimension of the latent space)
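Putting the pieces together, a minimal training-step sketch that reuses `reparameterize` and `kl_regularizer` from the sketches above; the toy data, the linear encoder/decoder, and the batch size of 128 are illustrative assumptions, not a fixed recipe:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Toy setup: 784-d inputs, 16-d latent space, mini-batch size above 100.
enc = nn.Linear(784, 2 * 16)                       # outputs [mu, logvar]
dec = nn.Linear(16, 784)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
loader = DataLoader(TensorDataset(torch.rand(10_000, 784)), batch_size=128, shuffle=True)

for (x,) in loader:
    mu, logvar = enc(x).chunk(2, dim=1)
    z = reparameterize(mu, logvar)                 # z_i = mu_i + sigma_i * eps_i per latent dim
    x_hat = torch.sigmoid(dec(z))
    recon = F.binary_cross_entropy(x_hat, x, reduction="none").sum(dim=1).mean()
    kl = kl_regularizer(mu, logvar)
    loss = recon + kl                              # negative ELBO: doubly stochastic estimate
    opt.zero_grad(); loss.backward(); opt.step()
```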
Mutual information
$H$ means the data entropy, $R$ means the KL term (the rate: the additional information carried by $Z$ about $X$), and $D$ means the reconstruction error (distortion).
Each $(R, D)$ point defines bounds on the mutual information $I(X;Z)$:
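Concretely, in the rate-distortion view of the ELBO (as in Alemi et al.'s "Fixing a Broken ELBO"), any achievable point satisfies the sandwich bound

$$
H - D \;\le\; I(X;Z) \;\le\; R, \qquad \text{so} \qquad H \;\le\; R + D.
$$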
- Auto-Encoding limit (D = 0, R = H)
  - All structure is encoded in the latent variables.
  - Therefore, the additional information to encode is the randomness (R = H).
  - A sufficiently powerful decoder will be able to perfectly decode the latent (D = 0).
- Auto-Decoding limit (D = H, R = 0)
  - The posterior over the latent variables is ignored (R = 0).
  - Z contains none of X's structure, and the decoder reconstructs X independently of Z (D = H).
Neither limit is desirable, so the ideal operating point lies somewhere between the two.
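A minimal sketch, reusing the encoder/decoder shapes assumed above, of how the rate $R$ (KL term) and distortion $D$ (reconstruction negative log-likelihood) can be measured per batch so a trained model can be located between the two limits; the function name and the Bernoulli likelihood are assumptions:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def rate_distortion(x, enc, dec):
    """Return (R, D) for one batch, in nats per example.

    R: KL(q(z|x) || N(0, I)) -- the rate term of the ELBO.
    D: reconstruction negative log-likelihood -- the distortion term.
    """
    mu, logvar = enc(x).chunk(2, dim=1)
    R = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=1).mean()
    z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
    x_hat = torch.sigmoid(dec(z))
    D = F.binary_cross_entropy(x_hat, x, reduction="none").sum(dim=1).mean()
    return R.item(), D.item()
```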
Training Dynamics
Check that the KL term does not collapse and converges stably, and compare the train and validation reconstruction losses to identify under/overfitting.
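A minimal monitoring sketch under the same assumptions: watch the KL term for posterior collapse (KL pushed toward 0) and compare train vs. validation reconstruction loss to spot under/overfitting; the logging function and the collapse threshold are illustrative:

```python
def log_epoch(epoch: int, train_kl: float, train_recon: float, val_recon: float,
              collapse_threshold: float = 0.1) -> None:
    """Print the quantities worth tracking each epoch (per-example nats)."""
    if train_kl < collapse_threshold:
        print(f"[epoch {epoch}] warning: KL ~ {train_kl:.3f} nats -- posterior may be collapsing")
    gap = val_recon - train_recon
    print(f"[epoch {epoch}] KL={train_kl:.3f}  recon(train)={train_recon:.3f}  "
          f"recon(val)={val_recon:.3f}  gap={gap:.3f}")
    # A growing positive gap suggests overfitting; high reconstruction loss on
    # both splits suggests underfitting.
```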