It is good to separate out hard zeros for interpretability. This is because sparsity causes a Dirac delta (a point mass) in the density at zero, on top of the continuous part.
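A minimal sketch of the point about hard zeros, under assumptions that are not in these notes: the values are model weights drawn from a standard normal, and sparsity comes from soft-thresholding (the proximal step of an L1 penalty). The helper names `soft_threshold` and `split_hard_zeros` are hypothetical. It only illustrates that the sparse values are a mixture of a point mass at exactly zero and a continuous remainder, which can be handled separately.

```python
import random


def soft_threshold(x: float, lam: float) -> float:
    """Proximal operator of the L1 penalty: shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0  # everything in [-lam, lam] becomes an exact (hard) zero


def split_hard_zeros(values):
    """Separate the mass at exactly zero from the nonzero values."""
    nonzero = [v for v in values if v != 0.0]
    zero_mass = 1.0 - len(nonzero) / len(values)
    return zero_mass, nonzero


# Assumed toy data: dense weights from a standard normal (fixed seed).
rng = random.Random(0)
dense = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Sparsify; lam=1.0 is an arbitrary illustrative threshold.
sparse = [soft_threshold(w, lam=1.0) for w in dense]

zero_mass, nonzero = split_hard_zeros(sparse)
print(f"fraction of exact zeros (point mass at 0): {zero_mass:.3f}")
print(f"values left for the continuous part: {len(nonzero)}")
```

A histogram or density estimate fitted to `sparse` as a whole would be dominated by the spike at zero; reporting `zero_mass` on its own and looking at `nonzero` separately avoids that.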
But whether sparsity is actually a good proxy for interpretability is still an open question.
Linear models and tree-based models are considered to have intrinsic interpretability.
regularization forest
VAE