Model Optimizers

Update weights & manage the learning rate. An optimizer is an algorithm that adjusts the parameters of a model in order to minimize the difference between the predicted output and the actual output on the training data.

Optimizers covered here:
- Adam Optimizer
- AdamW Optimizer
- AdaFactor
- Sophia Optimizer
- Adagrad
- RMSprop

Related sub-pages: Model Optimizer Notion, Distributed Optimizer Visualization

Gradient descent visualization - hills
Five gradient descent methods (gradient descent, momentum, Adagrad, RMSprop, and Adam) racing down a terrain with two hills.
Software: https://github.com/lilipads/gradient_descent_viz
Blog post: https://towardsdatascience.com/a-visual-explanation-of-gradient-descent-methods-momentum-adagrad-rmsprop-adam-f898b102325c
Video: https://www.youtube.com/watch?v=ilYd4TAzNoU

Adafactor Optimizer for Deep Learning
An introduction to Adafactor, which uses little memory while also finding the learning rate on its own.
https://heegyukim.medium.com/adafactor-optimizer-for-deep-learning-8268ca91e506

[Paper Review] A look at AdamW: "Decoupled Weight Decay Regularization" paper review (1)
https://hiddenbeginner.github.io/deeplearning/paperreview/2019/12/29/paper_review_AdamW.html
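To make the definition above concrete, here is a minimal NumPy sketch of a single optimizer step in the style of AdamW, the method reviewed in the linked paper: Adam's moment estimates plus weight decay applied directly to the weights, decoupled from the gradient. The function name `adamw_step` and the hyperparameter defaults are illustrative assumptions, not any particular library's API.

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW-style update: Adam moments plus decoupled weight decay."""
    # Update biased first- and second-moment estimates of the gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias-correct the moment estimates (t is the 1-based step count).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Adam step, plus weight decay applied directly to the weights
    # (decoupled from the gradient, unlike L2 regularization inside Adam).
    theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v

# Toy usage: minimize f(theta) = ||theta||^2 from a random starting point.
theta = np.random.randn(5)
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 201):
    grad = 2 * theta                  # gradient of ||theta||^2
    theta, m, v = adamw_step(theta, grad, m, v, t)
print(theta)                          # should be close to zero
```

In practice a framework optimizer (e.g. torch.optim.AdamW, torch.optim.Adagrad, torch.optim.RMSprop) would be used instead of hand-rolling the update; the sketch only illustrates what one weight-update step does.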