Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers — LessWrong
Executive Summary: In this post I present my results from training a Sparse Autoencoder (SAE) on a CLIP Vision Transformer (ViT) using the ImageNet-1k…
https://www.lesswrong.com/posts/bCtbuWraqYTDtuARg/towards-multimodal-interpretability-learning-sparse-2
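The linked post trains a sparse autoencoder on CLIP ViT activations over ImageNet-1k. Below is a minimal sketch of that general setup, not the author's code: the model name, hooked layer, expansion factor, L1 coefficient, and the random pixel tensors standing in for ImageNet-1k images are all illustrative assumptions.

```python
# Minimal sketch: train a sparse autoencoder (SAE) on CLIP ViT activations.
# All hyperparameters and the choice of layer/token are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import CLIPVisionModel

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.enc = nn.Linear(d_model, d_hidden)
        self.dec = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        feats = torch.relu(self.enc(x))   # sparse feature activations
        return self.dec(feats), feats

device = "cuda" if torch.cuda.is_available() else "cpu"
vit = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
d_model = vit.config.hidden_size          # 768 for ViT-B/32
sae = SparseAutoencoder(d_model, 8 * d_model).to(device)  # 8x expansion (assumed)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
l1_coeff = 1e-3                           # sparsity penalty (assumed value)

for step in range(100):                   # placeholder loop; the post streams ImageNet-1k images
    pixels = torch.randn(8, 3, 224, 224, device=device)   # stand-in for preprocessed images
    with torch.no_grad():
        hidden = vit(pixel_values=pixels, output_hidden_states=True).hidden_states
        acts = hidden[-2][:, 0, :]        # e.g. CLS-token activations at a late layer (assumed)
    recon, feats = sae(acts)
    loss = (recon - acts).pow(2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this kind of setup, the reconstruction term keeps the SAE faithful to the ViT's activations while the L1 penalty pushes most feature activations to zero, which is what makes the learned features sparse and (hopefully) interpretable.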