FlexOlmo

Creator
Seonglae Cho
Created
2025 Jul 28 8:09
Edited
2025 Nov 12 23:00

Training Experts to Coordinate

The model anchors on a single public FFN, trained on public data, that all experts share.
  • Without data sharing, each organization independently trains expert modules on its own data
  • Each expert is trained in a two-expert MoE paired with the frozen public FFN, so that experts learn to coordinate and do not conflict when they are all combined later
  • Each expert's router embedding is initialized by averaging embeddings of sample documents from its domain
  • Simply concatenating the router embeddings from all experts completes the merged MoE router; because it starts out encoding input-domain similarity, it needs only minimal fine-tuning (see the sketch after this list)
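A minimal PyTorch sketch of this initialize-and-concatenate scheme, using toy random tensors in place of real document embeddings; names such as `domain_docs`, `router_weight`, and `route` are illustrative and not from the FlexOlmo codebase:

```python
import torch

hidden_dim = 16

# Toy stand-ins for per-domain sample-document embeddings; in FlexOlmo these
# would come from embedding sample documents of each owner's corpus.
domain_docs = {
    "public": torch.randn(100, hidden_dim),
    "org_a": torch.randn(100, hidden_dim),
    "org_b": torch.randn(100, hidden_dim),
}

# Each owner initializes its router embedding as the mean document embedding.
router_embeddings = {name: docs.mean(dim=0) for name, docs in domain_docs.items()}

# Merging experts: stacking (concatenating) the per-expert router embeddings
# yields the router weight matrix of the combined MoE, with no joint training.
router_weight = torch.stack(list(router_embeddings.values()))  # (n_experts, hidden_dim)

def route(hidden_state: torch.Tensor, top_k: int = 2):
    """Score experts by similarity to the input and select the top-k."""
    logits = hidden_state @ router_weight.T  # (n_experts,)
    probs = torch.softmax(logits, dim=-1)
    return torch.topk(probs, k=top_k)

weights, expert_ids = route(torch.randn(hidden_dim))
print(expert_ids.tolist(), weights.tolist())
```

Because the router rows are just per-expert embeddings, adding or dropping an expert is a matter of adding or dropping a row.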
Results
  • Removing a specific expert module → that data's influence is completely eliminated, enabling opt-in/opt-out of specific data at inference time (see the sketch after this list)
  • Expert inclusion can be adjusted to satisfy license, copyright, and access-permission requirements
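A toy PyTorch illustration of inference-time opt-out under the same assumptions (hypothetical names, random weights): dropping an expert's FFN and its router row removes that expert's contribution entirely, and the router renormalizes over the remaining experts.

```python
import torch
import torch.nn as nn

hidden_dim = 16

# Hypothetical experts: one shared public FFN plus one expert per data owner
# (real FFNs are MLPs; single linear layers keep the sketch short).
experts = nn.ModuleDict({
    "public": nn.Linear(hidden_dim, hidden_dim),
    "org_a": nn.Linear(hidden_dim, hidden_dim),
    "org_b": nn.Linear(hidden_dim, hidden_dim),
})
router_embeddings = {name: torch.randn(hidden_dim) for name in experts}

def moe_forward(x: torch.Tensor, active: list[str]) -> torch.Tensor:
    """Mix the outputs of only the currently opted-in experts."""
    w = torch.stack([router_embeddings[n] for n in active])
    probs = torch.softmax(x @ w.T, dim=-1)  # renormalizes over active experts
    return sum(p * experts[n](x) for p, n in zip(probs, active))

x = torch.randn(hidden_dim)
print(moe_forward(x, ["public", "org_a", "org_b"]))  # full model
print(moe_forward(x, ["public", "org_a"]))           # org_b's data opted out
```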
 
 
FlexOlmo (allenai)
 
 
