Challenges the conventional belief that there is an inevitable "performance ↔ interpretability tradeoff".
Non-shared MLP dimensions are more monosemantic compared to single model. Phase Change does not occur during training since suspectably AI Feature Dimensionality's drastic changes do not happen. Based on this, reinterpreting the definition of Expert Specialization means that instead of load balancing, it is about monosemantically representing specific features. Experts maintain high monosemanticity for the features they were initially assigned to.

Seonglae Cho