DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for managing computational costs when scaling up model parameters. However, conventional MoE architectures like GShard, which activate the top-K out of N experts, face challenges in ensuring expert specialization, i.e., that each expert acquires non-overlapping and focused knowledge.
https://arxiv.org/abs/2401.06066v1
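For context, the "activate the top-K out of N experts" routing that the abstract refers to can be sketched in a few lines of PyTorch. This is a minimal illustration of conventional top-K MoE routing under assumed shapes and names (TopKMoELayer, d_model, d_hidden, etc.), not the paper's DeepSeekMoE architecture.

```python
# Minimal sketch of conventional top-K expert routing (GShard-style), for
# illustration only. All class/parameter names here are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoELayer(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        # One feed-forward network per expert.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )
        # Router that scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)              # (tokens, experts)
        top_scores, top_idx = scores.topk(self.top_k, dim=-1)   # keep K experts per token
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, k] == e                        # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += top_scores[mask, k:k + 1] * expert(x[mask])
        return out


# Example: route 8 tokens, each through 2 of 8 experts.
layer = TopKMoELayer(d_model=16, d_hidden=64, num_experts=8, top_k=2)
tokens = torch.randn(8, 16)
print(layer(tokens).shape)  # torch.Size([8, 16])
```

Because each token only passes through K of the N expert FFNs, compute per token stays roughly constant as N grows, which is the cost-management property the abstract highlights; DeepSeekMoE's contribution (per the paper) is improving how specialized those experts become.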