Identifies latent thought structures across multiple LLMs and enables latent-level communication
Uses a single shared Sparse Autoencoder (Jacobian SAE) to simultaneously encode and decode all agent hidden states.
Each agent selectively reads and shares only its relevant latent subspace during inference.
Viewed as a gradient-level MoE, the Jacobian's zero/non-zero pattern effectively acts as a 'routing mask'.
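As a rough illustration of that 'routing mask' idea (a sketch under my own assumptions, not the paper's implementation), the snippet below builds a shared SAE with one linear read-out head per agent. Because each read-out is linear, its Jacobian with respect to the shared latents is simply its weight matrix, so thresholding the column magnitudes gives a binary mask of which latent dimensions an agent is connected to. All names (`SharedSAE`, `agent_readouts`, `routing_mask`) are hypothetical.

```python
import torch
import torch.nn as nn

class SharedSAE(nn.Module):
    """Hypothetical shared SAE with per-agent linear read-out heads (illustrative)."""
    def __init__(self, d_model: int, d_latent: int, n_agents: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        # One linear read-out per agent; for a linear map, the Jacobian
        # d(read-out)/d(latents) is exactly this weight matrix.
        self.agent_readouts = nn.ModuleList(
            [nn.Linear(d_latent, d_model, bias=False) for _ in range(n_agents)]
        )

    def encode(self, h: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.encoder(h))  # sparse, non-negative latents

    def readout(self, z: torch.Tensor, agent: int) -> torch.Tensor:
        return self.agent_readouts[agent](z)

def routing_mask(sae: SharedSAE, agent: int, threshold: float = 1e-3) -> torch.Tensor:
    """Zero/non-zero pattern of an agent's Jacobian, read off as a routing mask."""
    W = sae.agent_readouts[agent].weight    # (d_model, d_latent)
    return W.abs().amax(dim=0) > threshold  # (d_latent,) boolean mask over latent dims
```

During inference, an agent would then read and share only the latent dimensions where its mask is `True`, which is the selective-subspace behavior described above.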
Jacobian mask
Only a subset of the SAE latent dimensions is connected to each agent. This connectivity is represented by the Jacobian mask, which forms automatically during training through Jacobian sparsity regularization.
Although the mask depends on the input, which makes it resemble MoE routing, in practice it behaves as a quasi-static routing map. Unlike MoE, it has the advantage of working across different architecture families.
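Continuing the sketch above, one plausible way such a mask could emerge during training is an L1 penalty on each agent's read-out Jacobian; with linear read-outs this reduces to an L1 penalty on the weight matrix, which drives entire latent columns to zero for agents that do not need them. This is my reading of "Jacobian sparsity regularization", not a verified reproduction of ThoughtComm's objective.

```python
import torch
import torch.nn.functional as F

def shared_sae_loss(sae, hidden_states, jac_l1=1e-3, latent_l1=1e-4):
    """Reconstruction + latent sparsity + Jacobian sparsity (illustrative sketch).

    hidden_states: one tensor per agent, each of shape (batch, d_model).
    """
    loss = hidden_states[0].new_zeros(())
    for agent, h in enumerate(hidden_states):
        z = sae.encode(h)
        h_hat = sae.readout(z, agent)
        loss = loss + F.mse_loss(h_hat, h)        # reconstruct each agent's hidden state
        loss = loss + latent_l1 * z.abs().mean()  # standard SAE sparsity on activations
        # Jacobian sparsity: for the linear read-out above, the Jacobian equals
        # the weight matrix, so this L1 term zeroes out whole latent dimensions
        # per agent -- the resulting zero/non-zero pattern is the routing mask.
        J = sae.agent_readouts[agent].weight
        loss = loss + jac_l1 * J.abs().sum()
    return loss
```

In this simplified linear version the Jacobian does not depend on the input at all; with a nonlinear read-out it would vary per input, which is where the quasi-static-routing interpretation above comes from.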
SAE Model Transferability
However, since training an SAE is already challenging for a single model, it is questionable how well representation matching will work when one SAE is shared across multiple models; it would likely require a lot of data.
- Use hidden states after layer normalization + pooling
- Unify autoencoder input dimensions (same embedding dim); see the adapter sketch after this list
- Jacobian sparsity forces alignment as a side-effect
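A hedged sketch of how the first two ingredients might be wired together when the agents have different hidden sizes; the adapter class, `d_shared`, and mean pooling over the sequence are my assumptions rather than details confirmed by the source.

```python
import torch
import torch.nn as nn

class HiddenStateAdapter(nn.Module):
    """Maps one model's hidden states into a shared SAE input space (illustrative)."""
    def __init__(self, d_model: int, d_shared: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)         # layer-normalize the hidden states
        self.proj = nn.Linear(d_model, d_shared)  # unify the embedding dimension

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, d_model) from one particular LLM / layer
        h = self.norm(h)
        h = h.mean(dim=1)    # pool over the sequence dimension
        return self.proj(h)  # (batch, d_shared): common input space for the shared SAE

# Example: two agents from different architecture families feeding one shared SAE.
agent_a = HiddenStateAdapter(d_model=4096, d_shared=2048)
agent_b = HiddenStateAdapter(d_model=3584, d_shared=2048)
x_a = agent_a(torch.randn(8, 128, 4096))
x_b = agent_b(torch.randn(8, 128, 3584))
```

The third bullet (alignment via Jacobian sparsity) then happens inside the shared SAE itself, as in the loss sketch above.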
ThoughtComm's SAE is likely not capturing a fully disentangled representation, but rather something more like "low-rank correlated directions".
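One way to sanity-check that intuition (my own diagnostic idea, not something from ThoughtComm) is to look at the singular-value spectrum of the learned read-out directions: if a handful of singular values carry most of the energy, the latents behave more like correlated low-rank directions than like a disentangled dictionary.

```python
import torch

def effective_rank(W: torch.Tensor, energy: float = 0.95) -> int:
    """Number of singular values needed to capture `energy` of the total spectral
    energy -- a rough proxy for how low-rank the learned directions are."""
    s = torch.linalg.svdvals(W)                    # singular values, descending
    cum = torch.cumsum(s**2, dim=0) / (s**2).sum()
    return int((cum < energy).sum().item()) + 1

# Example (hypothetical): inspect one agent's read-out matrix from the sketch above.
# print(effective_rank(sae.agent_readouts[0].weight.detach()))
```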

Seonglae Cho