- Cross-machine multi-GPU training
- Offloading parameters to CPU
- Mixed precision training
- Gradient accumulation, which enables training on a single GPU
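
To illustrate the gradient accumulation item, here is a minimal sketch in plain Python. It is not tied to any particular training library. The toy linear model, the helper names `grad_mse` and `accumulated_step`, and the data are all assumptions made for this example. The idea shown is that gradients from several small micro-batches are summed before a single weight update, so the update matches what one large batch would produce while only a small batch has to be held in memory at a time.

```python
def grad_mse(w, b, batch):
    """Gradient of the mean squared error for y = w*x + b over one batch."""
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return gw, gb


def accumulated_step(w, b, data, micro_batch_size, lr):
    """One optimizer step whose gradient is accumulated over micro-batches."""
    micro_batches = [data[i:i + micro_batch_size]
                     for i in range(0, len(data), micro_batch_size)]
    acc_gw, acc_gb = 0.0, 0.0
    for mb in micro_batches:
        gw, gb = grad_mse(w, b, mb)
        # Weight each micro-batch by its share of the samples so the
        # accumulated sum equals the full-batch mean gradient.
        weight = len(mb) / len(data)
        acc_gw += gw * weight
        acc_gb += gb * weight
    # A single parameter update after all micro-batches have been processed.
    return w - lr * acc_gw, b - lr * acc_gb


# Hypothetical toy data drawn from y = 3x + 1.
data = [(float(x), 3.0 * x + 1.0) for x in range(8)]

full = accumulated_step(0.0, 0.0, data, micro_batch_size=8, lr=0.01)
accum = accumulated_step(0.0, 0.0, data, micro_batch_size=2, lr=0.01)
```

Here `full` takes one step on all eight samples at once, while `accum` processes four micro-batches of two samples each before updating. Both should arrive at the same parameters up to floating-point rounding, which is why accumulation can stand in for a larger batch when memory is limited.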