MoE steering

Creator
Seonglae Cho
Created
2026 May 8 11:23
Edited
2026 May 28 15:59
Refs
arxiv.org

Steer-MoE

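SteerMoE inserts a Mixture-of-Experts (MoE)-based steering module into each layer of an audio encoder, dynamically transforming audio representations into a space that an LLM can interpret. Concretely, at layer $l$, a shared router produces gating scores $g^{(l)}$, which are used to compute a steering adjustment $\Delta h^{(l)}$ as a weighted sum over expert vectors $e_k^{(l)}$ and applied to the hidden state $h^{(l)}$. For example (the symbols and the additive form are illustrative notation):
$$\Delta h^{(l)} = \sum_{k} g_k^{(l)} \, e_k^{(l)}, \qquad \tilde{h}^{(l)} = h^{(l)} + \Delta h^{(l)}$$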
The adjusted hidden state is then passed through a linear projection and fed to the LLM as a “soft prompt”. Because this operates directly in a continuous vector space and skips discrete audio tokenization, it aims to minimize information loss while remaining “plug-and-play” (it does not require any modification to the LLM architecture). Notably, the shared router manages experts across all layers, improving parameter efficiency while enabling context-dependent steering/alignment.
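The steering step described above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the paper's implementation: the function names (`steer`, `project`), the softmax gating, the additive update, the single router matrix reused at every layer, and all sizes and random weights are hypothetical choices made for the example. The encoder layers themselves are omitted.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def steer(h, router_w, experts):
    """Steer one hidden vector h at one layer.

    router_w: one weight row per expert (assumed shared by all layers).
    experts:  this layer's expert vectors, each the same length as h.
    Returns (adjusted hidden state, gating scores).
    """
    gates = softmax([dot(w, h) for w in router_w])
    # Steering adjustment: gate-weighted sum of the expert vectors.
    delta = [sum(g * e[i] for g, e in zip(gates, experts)) for i in range(len(h))]
    # Assumed additive update of the hidden state.
    return [x + d for x, d in zip(h, delta)], gates


def project(h, proj_w):
    """Linear projection from the audio width to the LLM embedding width."""
    return [dot(row, h) for row in proj_w]


rng = random.Random(0)
d_audio, d_llm, n_layers, n_experts = 4, 6, 3, 2  # toy sizes (hypothetical)


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


router_w = rand_matrix(n_experts, d_audio)                            # shared router
experts = [rand_matrix(n_experts, d_audio) for _ in range(n_layers)]  # per-layer experts
proj_w = rand_matrix(d_llm, d_audio)

h = [rng.uniform(-1, 1) for _ in range(d_audio)]
for layer in range(n_layers):
    # A real encoder layer would transform h here; only the steering is shown.
    h, gates = steer(h, router_w, experts[layer])

soft_prompt = project(h, proj_w)  # continuous vector handed to the LLM
```

In this sketch only `router_w`, `experts`, and `proj_w` would be trainable, which mirrors the plug-and-play idea: the LLM receives `soft_prompt` as an ordinary embedding-space input and is not modified.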
Steer-MoE: Efficient Audio-Language Alignment with a...
Aligning pretrained audio encoders and Large Language Models (LLMs) offers a promising, parameter-efficient path to building powerful multimodal agents. However, existing methods often require...
