torch.amp.GradScaler()

Creator

Seonglae Cho

Created

2023 Aug 20 15:16

Editor

Seonglae Cho

Edited

2024 Mar 9 6:28

Refs

Gradient Scaling

Mixed Precision

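Unscaling is automatic.

update() updates the scale factor.

It keeps the scale factor at a suitable level with FP overflow and underflow in mind: the factor is reduced when a step produces inf/NaN gradients and increased after a run of steps without overflow.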
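The update rule described above can be sketched in plain Python. This is only an illustration of the documented behavior, not PyTorch's actual implementation: the helper name update_scale is hypothetical, while the parameter names and defaults (growth_factor=2.0, backoff_factor=0.5, growth_interval=2000, initial scale 65536) follow the GradScaler constructor.

```python
def update_scale(scale, growth_tracker, found_inf,
                 growth_factor=2.0, backoff_factor=0.5, growth_interval=2000):
    """Hypothetical helper mimicking what GradScaler.update() does to the scale."""
    if found_inf:
        # Overflow: the optimizer step was skipped, so shrink the scale
        # and restart the count of clean steps.
        return scale * backoff_factor, 0
    growth_tracker += 1
    if growth_tracker == growth_interval:
        # Enough consecutive clean steps: try a larger scale.
        return scale * growth_factor, 0
    return scale, growth_tracker


scale, tracker = 65536.0, 0
history = []
# Three clean steps followed by one overflow, with a short interval for the demo.
for found_inf in [False, False, False, True]:
    scale, tracker = update_scale(scale, tracker, found_inf, growth_interval=3)
    history.append(scale)

print(history)  # [65536.0, 65536.0, 131072.0, 65536.0]
```

With the real default growth_interval of 2000, the scale grows far more slowly than in this demo, while any single overflow halves it immediately.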

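unscale_() is already called inside step(), but when extra work on the gradients is needed, such as Gradient Clipping, it must be called explicitly beforehand.

Scaling raises the chance of overflow, which is why Gradient Clipping is applied. The clipping threshold is meant for unscaled gradients, so clip only after unscale_().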

[Pytorch] apex / amp: how to train models faster.

When browsing PyTorch code on GitHub, you often see apex and amp in use. amp stands for Automatic Mixed Precision; it speeds up training by using the float16 data type for some operations. It is also worth noting that PyTorch's default data type is float32. Searching around, you will find two ways to use amp, apex.amp and torch.cuda.amp; according to the post below, PyTorch has started supporting apex's implementations natively, so apex will no longer be used going forward. https://discuss.pytorch.org/t/torch-c..

https://dbwp031.tistory.com/33

