SDXL
sdxl-turbo-interpretability
goodfire-ai • Updated 2025 Jul 2 20:15
Combines sparse autoencoders (SAE) and non-negative matrix factorization (NMF) to turn the model's internal representations into human-understandable units, so that the otherwise black-box diffusion model can be inspected and manipulated directly. Hundreds of SAE features are grouped by NMF into several high-level units (factors), which merges features that SAE feature splitting had separated. In the equation $V \approx WH$, V is the original SAE activation strength matrix and W gives the strength of each factor for each row of V. Each row of H represents a high-level factor, and the values in that row are the weights of the corresponding SAE features.
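The factorization described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the project's actual code. It assumes rows of V are samples and columns are SAE features, uses the standard multiplicative-update rule for NMF, and runs on synthetic data. The names `nmf`, `top_features`, and the toy matrix sizes are hypothetical.

```python
import numpy as np

def nmf(V, k, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) into W (n x k) and H (k x m)
    so that V ≈ W @ H, using multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # The updates only multiply by non-negative ratios,
        # so W and H stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def top_features(H, factor, n=3):
    """Indices of the n SAE features with the largest weight in one factor (row of H)."""
    return np.argsort(H[factor])[::-1][:n]

# Synthetic stand-in for an SAE activation matrix (assumption: 40 samples,
# 6 SAE features that fall into two groups of related features).
rng = np.random.default_rng(1)
true_W = rng.random((40, 2))
true_H = np.array([
    [1.0, 0.8, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.7, 0.6],
])
V = true_W @ true_H

W, H = nmf(V, k=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

print("relative reconstruction error:", rel_err)
for f in range(H.shape[0]):
    print("factor", f, "top SAE features:", top_features(H, f))
```

Reading a row of H with `top_features` is how a factor would be interpreted in this setup: the highest-weighted SAE features in that row are the ones the factor groups together. The number of factors `k` is a free choice here.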

NMF

demo
blog
umap
skull
breast
lips
model sae