FlexOlmo

Creator

Seonglae Cho

Created

2025 Jul 28 8:9

Editor

Seonglae Cho

Edited

2025 Nov 12 23:0

Refs

Training Experts to Coordinate

The model anchors on a single Public FFN that all experts share.

Each organization trains its own model module (an expert) independently on its own data, so no data is ever shared.

Each expert is trained as a pair with the Public FFN, so that experts trained separately stay compatible and do not conflict when they are all combined later.
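A minimal sketch of what "training in a pair" means at the layer level, assuming a dense softmax mix of two feed-forward blocks. The function name `pair_forward` and the toy FFNs are hypothetical and only illustrate the shape of the computation; they are not the FlexOlmo implementation.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def pair_forward(x, public_ffn, expert_ffn, router_rows):
    """Mix the shared Public FFN and one local expert for a single token vector x."""
    # One router row per module: row 0 scores the Public FFN, row 1 the local expert.
    weights = softmax([dot(row, x) for row in router_rows])
    outs = [public_ffn(x), expert_ffn(x)]
    return [sum(w * o[i] for w, o in zip(weights, outs)) for i in range(len(x))]

# Toy stand-ins for real feed-forward networks (hypothetical).
public_ffn = lambda x: [2.0 * v for v in x]
expert_ffn = lambda x: [v + 1.0 for v in x]

x = [1.0, 0.0]
router_rows = [[0.0, 0.0], [0.0, 0.0]]  # equal logits, so both modules get weight 0.5
y = pair_forward(x, public_ffn, expert_ffn, router_rows)
```

Because every organization trains against the same Public FFN, each expert learns to produce outputs that mix sensibly with that shared anchor, which is what lets independently trained experts be combined afterwards.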

The router weights are initialized from router embeddings, one per domain, each created by averaging the embeddings of sample documents from that domain.

Simply concatenating the router embeddings of all experts completes the MoE router: it starts out routing by the similarity between the input and each domain, and needs only minimal fine-tuning.
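The router construction described here can be sketched as follows. This is an illustration under stated assumptions: the two-dimensional embeddings, the domain names, and the helpers `mean_embedding`, `build_router`, and `route` are all made up for the example, and a dense softmax stands in for whatever routing rule the real model uses.

```python
import math

def mean_embedding(doc_embeddings):
    """Average a list of equal-length document embeddings into one router embedding."""
    n = len(doc_embeddings)
    dim = len(doc_embeddings[0])
    return [sum(e[i] for e in doc_embeddings) / n for i in range(dim)]

def build_router(domain_samples):
    """Stack one averaged embedding per domain; the rows together form the router."""
    names = list(domain_samples)
    rows = [mean_embedding(domain_samples[name]) for name in names]
    return names, rows

def route(x, rows):
    """Score an input vector against every domain row and normalize with softmax."""
    logits = [sum(p * q for p, q in zip(row, x)) for row in rows]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical sample document embeddings contributed per domain.
domain_samples = {
    "public": [[1.0, 0.0], [1.0, 0.0]],
    "news": [[0.0, 1.0], [0.0, 3.0]],
}
names, rows = build_router(domain_samples)
weights = route([0.0, 1.0], rows)  # an input that resembles the "news" domain
```

Each organization only has to contribute its own row, so assembling the router is a concatenation rather than a joint training run.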

Results

Removing a specific expert module completely eliminates the influence of the data it was trained on, so specific data can be opted in or out at inference time.

The set of active experts can be adjusted according to license, copyright, and access permission requirements.
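As a rough illustration of opt-out at inference time, the sketch below drops one expert together with its router row and runs the same input again. The names `moe_forward` and `drop_expert`, the toy experts, and the dense softmax mix are assumptions made for the example, not the released model's API.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, names, rows, experts):
    """Mix expert outputs with softmax router weights; returns (output, weights by name)."""
    weights = softmax([sum(p * q for p, q in zip(row, x)) for row in rows])
    outs = [experts[name](x) for name in names]
    mixed = [sum(w * o[i] for w, o in zip(weights, outs)) for i in range(len(x))]
    return mixed, dict(zip(names, weights))

def drop_expert(names, rows, experts, opt_out):
    """Remove one expert and its router row; nothing is retrained."""
    keep = [i for i, name in enumerate(names) if name != opt_out]
    kept_names = [names[i] for i in keep]
    return kept_names, [rows[i] for i in keep], {n: experts[n] for n in kept_names}

# Hypothetical experts and router rows.
names = ["public", "news", "code"]
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
experts = {
    "public": lambda x: [v for v in x],
    "news": lambda x: [v * 10.0 for v in x],
    "code": lambda x: [-v for v in x],
}

x = [1.0, 1.0]
_, full_weights = moe_forward(x, names, rows, experts)
names2, rows2, experts2 = drop_expert(names, rows, experts, "news")
_, reduced_weights = moe_forward(x, names2, rows2, experts2)
```

After the drop, the "news" module is never called and the remaining router weights renormalize over the experts that are still present.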

https://arxiv.org/pdf/2507.07024

allenai/FlexOlmo-7x7B-1T · Hugging Face

https://huggingface.co/allenai/FlexOlmo-7x7B-1T

Introducing FlexOlmo: a new paradigm for language model training and data collaboration | Ai2

Explore how FlexOlmo enables collaborative language model training without sacrificing data privacy or control, introducing a new, flexible approach to building shared AI models.

https://allenai.org/blog/flexolmo
