Model Layer Scaling Techniques
- Depth Up-Scaling
- COCONUT
- Recursive Transformer

Model Layer Optimization
- MoD
- LayerSkip
- Transkimmer

https://arxiv.org/pdf/2203.00555