STE (Straight-Through Estimator)
For backpropagation through non-differentiable functions
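In forward propagation, the non-differentiable function (e.g. rounding, sign, or a hard threshold) is applied as is. In backpropagation, its true derivative, which is zero almost everywhere or undefined, is replaced by a simple surrogate (often the derivative of the identity function, i.e. 1). Gradients therefore pass through the non-differentiable part, and the weights of the entire model can still be updated.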
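
The idea can be sketched in plain Python with a hand-written backward pass. This is only a minimal illustration under assumptions not in the note: rounding is the non-differentiable function, the loss is a squared error, and the names `quantize_forward`, `quantize_backward_ste`, and `train_step` are hypothetical.

```python
def quantize_forward(x):
    """Non-differentiable forward pass: round to the nearest integer."""
    return float(round(x))


def quantize_backward_ste(grad_output):
    """STE backward pass: pretend rounding was the identity function,
    so the incoming gradient is passed through unchanged.
    (The true derivative of round() is 0 almost everywhere.)"""
    return grad_output


def train_step(w, x, target, lr):
    # Forward: a linear "model" followed by the non-differentiable rounding.
    pre = w * x
    y = quantize_forward(pre)
    loss = (y - target) ** 2

    # Backward: chain rule, with the STE surrogate used for rounding.
    grad_y = 2.0 * (y - target)
    grad_pre = quantize_backward_ste(grad_y)
    grad_w = grad_pre * x

    # Gradient-descent update of the weight.
    return w - lr * grad_w, loss


w = 0.2  # assumed initial weight
for _ in range(50):
    w, loss = train_step(w, x=1.0, target=3.0, lr=0.1)
```

With the true derivative of rounding (zero), `grad_w` would always be zero and `w` would never move. With the identity surrogate, `w` is pushed until the rounded output reaches the target.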