forward f16 → backward f16 could be flush as 0by multiplying gradient scaling (scale factor) to loss, scaled loss computed for backward pass