Model Optimizer

Creator: Seonglae Cho
Created: 2023 Jun 18 8:47
Edited: 2025 Dec 18 16:16

Update weights & manage learning rate

An algorithm that adjusts a model's parameters to minimize the difference between the predicted output and the actual output on the training data.
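As a minimal sketch of this idea (plain Python, with a hypothetical 1-D quadratic loss standing in for a real model), the optimizer repeatedly updates the weight against the gradient so the prediction error shrinks:

```python
# Minimal gradient descent on a toy loss L(w) = (w - 3)^2,
# whose minimum (the "actual output" being fit) is at w = 3.

def grad(w):
    # dL/dw for L(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

def sgd(w, lr=0.1, steps=100):
    for _ in range(steps):
        w = w - lr * grad(w)  # update weight: step against the gradient
    return w

w_final = sgd(w=0.0)  # converges toward the minimum at w = 3
```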
With a fixed learning rate, the model may oscillate or fail to converge. Constant-magnitude gradients, such as the sign function arising from the L1 norm, are one reason adaptive learning rates are used.
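A small sketch of that failure mode (assuming the 1-D loss L(w) = |w|, whose gradient is the constant-magnitude sign function): fixed-step SGD oscillates around the minimum forever, while an Adagrad-style accumulator shrinks the effective step so the iterate settles:

```python
import math

def sign_grad(w):
    # Gradient of L(w) = |w| is sign(w): constant magnitude everywhere
    return 1.0 if w > 0 else -1.0 if w < 0 else 0.0

def sgd_fixed(w, lr=0.5, steps=50):
    # Fixed learning rate: the step never shrinks, so w bounces
    # back and forth across the minimum at 0 without converging.
    for _ in range(steps):
        w -= lr * sign_grad(w)
    return w

def adagrad(w, lr=0.5, steps=50, eps=1e-8):
    # Adagrad-style adaptation: divide by the accumulated gradient
    # norm, so the effective step decays like lr / sqrt(t).
    g2_sum = 0.0
    for _ in range(steps):
        g = sign_grad(w)
        g2_sum += g * g
        w -= lr * g / (math.sqrt(g2_sum) + eps)
    return w
```

Starting from w = 0.3, the fixed-rate run keeps jumping between 0.3 and -0.2, while the adaptive run ends much closer to the minimum.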
All Neural Network and Model Optimizer can be viewed as Associative Memory that compresses context flow.
Model Optimizers
Model Optimizer Notion

Visualization

Gradient descent visualization - hills
5 gradient descent methods (gradient descent, momentum, adagrad, rmsprop & adam) racing down a terrain with two hills. Software: https://github.com/lilipads/gradient_descent_viz Blog Post: https://towardsdatascience.com/a-visual-explanation-of-gradient-descent-methods-momentum-adagrad-rmsprop-adam-f898b102325c
Adafactor Optimizer for Deep Learning
An introduction to Adafactor, which uses little memory while also finding the learning rate on its own.
[Paper Review] Let's look at AdamW! Decoupled weight decay regularization paper review (1)
재야의 숨은 고수가 되고 싶은 초심자