SGD
Stochastic gradient descent samples a random training example (or mini-batch) for each update instead of computing the gradient over the whole dataset, as batch gradient descent does. The resulting gradient noise is often helpful for robustness and generalization. In many libraries, "SGD" also refers to vanilla gradient descent without momentum.
An alternative to batch gradient descent; it updates the parameters much more frequently
The model's parameters are updated after processing each training example (or mini-batch)
Very scalable, so it is used to train most models
An update is applied at every step rather than once per full pass over the data
The training data need to be reshuffled at the start of each training epoch, as in the sketch below
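A minimal sketch of these notes, assuming a toy linear-regression problem in NumPy; the data, learning rate, and batch size are illustrative choices, not prescriptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                 # toy features (assumed for illustration)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=1000)   # noisy targets

w = np.zeros(3)                                # parameters to learn
lr, batch_size, epochs = 0.05, 32, 10

for epoch in range(epochs):
    idx = rng.permutation(len(X))              # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]      # one random mini-batch
        pred = X[b] @ w
        grad = X[b].T @ (pred - y[b]) / len(b) # MSE gradient on this batch only
        w -= lr * grad                         # vanilla update: no momentum

print(w)                                       # close to true_w after a few epochs
```

Each inner-loop step touches only one mini-batch, which is what makes the method scalable and what introduces the gradient noise described above.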
Stochastic Variational Inference
Parameters are updated in the direction that reduces the KL divergence (KLD)
The KLD expression must be differentiable (see the sketch below)
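A minimal sketch of the KL-descent idea, assuming a univariate Gaussian variational distribution q = N(mu, sigma^2) and a standard normal prior, so the KLD has a closed, differentiable form. The parameter names and toy setup are illustrative assumptions; in full stochastic variational inference the gradient would instead be a noisy estimate from subsampled data or Monte Carlo samples.

```python
import math

mu, rho = 2.0, 0.7            # variational parameters; sigma = exp(rho) keeps sigma > 0
lr = 0.1

def kl(mu, rho):
    sigma = math.exp(rho)
    # closed-form KL( N(mu, sigma^2) || N(0, 1) ) -- differentiable in mu and rho
    return 0.5 * (sigma**2 + mu**2 - 1.0) - math.log(sigma)

for step in range(100):
    sigma = math.exp(rho)
    grad_mu = mu                   # dKL/dmu
    grad_rho = sigma**2 - 1.0      # dKL/drho via the chain rule through sigma = exp(rho)
    mu -= lr * grad_mu             # update in the direction that reduces the KLD
    rho -= lr * grad_rho

print(mu, math.exp(rho), kl(mu, rho))   # mu -> 0, sigma -> 1, KL -> 0
```

The update only works because the KLD above is differentiable with respect to the variational parameters, which is the requirement noted in these lines.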