Gradient Norm

Creator
Seonglae Cho
Created
2025 May 20 16:19
Editor
Edited
2025 May 20 16:20
Refs
Tracking gradient norms during training helps identify Vanishing Gradient or Exploding gradient, both of which can destabilize training. By monitoring the gradient norm of each parameter, you can verify that gradients are flowing properly through the network.
After computing the loss and calling loss.backward(), each parameter's gradient is accessible via param.grad. Calling .norm() on that tensor computes its Euclidean (L2) norm, a single scalar representing the magnitude of the gradients for that parameter.
for epoch in range(num_epochs):
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        # Inspect each parameter's gradient norm before the optimizer update
        for name, param in model.named_parameters():
            if param.grad is not None:
                print(f"Gradient norm for {name}: {param.grad.norm()}")
        optimizer.step()
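The per-parameter norms can also be reduced to a single global gradient norm, which is the quantity that torch.nn.utils.clip_grad_norm_ reports and caps. Below is a minimal sketch, assuming the same PyTorch model object and that loss.backward() has already been called; total_grad_norm is a hypothetical helper name, not part of the loop above.

import torch

def total_grad_norm(model: torch.nn.Module) -> float:
    # Hypothetical helper: stack the per-parameter gradient norms and take
    # their L2 norm, yielding one scalar for the whole model.
    norms = [p.grad.norm() for p in model.parameters() if p.grad is not None]
    return torch.norm(torch.stack(norms)).item()

# Usage right after loss.backward() and before optimizer.step():
# print(f"Total gradient norm: {total_grad_norm(model):.4f}")
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # clip if norms explode

Logging this value each step makes vanishing or exploding gradients easy to spot as a sustained drift toward zero or sudden spikes.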
