Clear Gradients in Optimizer
Setting gradients to None is better than the default `zero_grad()` behavior: instead of writing zeros into every parameter's gradient buffer, the gradient tensors are simply released, and the next backward pass assigns fresh gradients rather than accumulating into zero-filled memory. This cuts unnecessary memory operations and makes the backward pass more efficient. Use it when you want to reduce memory overhead and speed up training; the gradient values themselves are computed exactly as before.
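A minimal sketch of a training step, assuming a PyTorch model and optimizer are already set up; the model, data, and hyperparameters here are placeholders. The `set_to_none=True` flag of `optimizer.zero_grad()` (the default in recent PyTorch releases) drops the gradient tensors instead of zeroing them in place:

```python
import torch
import torch.nn as nn

# Toy model, data, and optimizer purely for illustration.
model = nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
inputs = torch.randn(32, 128)
targets = torch.randint(0, 10, (32,))

for step in range(10):
    # Clear gradients by setting them to None rather than filling them with zeros.
    # The next backward() then assigns new gradient tensors instead of
    # accumulating into zeroed buffers, avoiding the extra memory writes.
    optimizer.zero_grad(set_to_none=True)

    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
```

The same effect can be had by looping over `model.parameters()` and assigning `p.grad = None` manually; using the optimizer's flag just keeps the training loop tidy.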