SonicMoE

Creator: Seonglae Cho
Created: 2026 Jan 9 15:09
Edited: 2026 Jan 9 15:11
SonicMoE redesigns the MoE computation graph to minimize the activations that must be stored for the backward pass, which significantly reduces memory usage. It also aligns the number of tokens assigned to each expert with GPU GEMM tile sizes (token rounding), which eliminates unnecessary padding operations and improves processing speed. When training a 7B MoE model, throughput improves by 1.86× compared to ScatterMoE, and token rounding provides an additional ~16% improvement.
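The tile-alignment idea can be illustrated with a small sketch. The helper names, the tile size of 128, the per-expert counts, and the round-to-nearest rule are all assumptions made for illustration; the note above does not specify the exact rounding rule SonicMoE uses. The sketch only shows why per-expert token counts that are not multiples of the GEMM tile size cost padded rows, and how rounding the counts removes that padding.

```python
def padded_rows(counts, tile):
    """Rows a tiled grouped GEMM processes when each expert's
    token count is padded up to the next multiple of `tile`."""
    return sum(-(-c // tile) * tile for c in counts)  # ceil division per expert


def round_to_tile(counts, tile):
    """Hypothetical rounding rule: move each expert's token count to the
    nearest multiple of `tile` (ties round up). An expert may drop or
    gain a few tokens, so routing is only approximately preserved."""
    return [((c + tile // 2) // tile) * tile for c in counts]


# Illustrative numbers only, not measurements.
tile = 128
counts = [130, 250, 61, 200]           # tokens routed to each expert

useful = sum(counts)                    # 641 rows of real work
padded = padded_rows(counts, tile)      # 896 rows actually computed
wasted_before = padded - useful         # 255 rows of padding

rounded = round_to_tile(counts, tile)   # [128, 256, 0, 256]
wasted_after = padded_rows(rounded, tile) - sum(rounded)  # 0 rows of padding

print(f"padding rows before rounding: {wasted_before}")
print(f"padding rows after rounding:  {wasted_after}")
```

In this toy case, rounding changes the total number of processed tokens only slightly (641 to 640) while removing every padded row. How much this helps in practice depends on the tile size and on how tokens are distributed across experts.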
