Multimodal Interpretability Usages
Task Vectors are Cross-Modal
Task representations in vision-language models (VLMs) are consistent across modalities (text, image) and specifications (examples, instructions).
https://task-vectors-are-cross-modal.github.io/
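To make the claim concrete, here is a minimal sketch of how cross-modal consistency could be measured: average the hidden activations for one task into a "task vector", then compare vectors derived from text prompts and image prompts by cosine similarity. Everything here is an assumption for illustration: the activations are synthetic (built to share a task direction), the names `fake_activations` and `task_vector` and the task labels are hypothetical, and this is not the paper's actual pipeline. It shows the measurement only and does not reproduce any result.

```python
import math
import random

random.seed(0)
DIM = 64  # hypothetical hidden size


def rand_unit(scale=1.0):
    """Random direction with the given norm."""
    v = [random.gauss(0, 1) for _ in range(DIM)]
    norm = math.sqrt(sum(x * x for x in v))
    return [scale * x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Assumption: each task has a strong shared direction, and each modality
# adds a weaker offset. Real activations would come from a VLM instead.
task_dirs = {name: rand_unit(3.0) for name in ("capital", "color")}
modality_offsets = {name: rand_unit(1.0) for name in ("text", "image")}


def fake_activations(task, modality, n=32):
    """Synthetic stand-in for n hidden states at one token position."""
    base = [t + m for t, m in zip(task_dirs[task], modality_offsets[modality])]
    return [[x + random.gauss(0, 0.1) for x in base] for _ in range(n)]


def task_vector(acts):
    """Mean activation over examples."""
    n = len(acts)
    return [sum(col) / n for col in zip(*acts)]


tv = {
    (task, modality): task_vector(fake_activations(task, modality))
    for task in task_dirs
    for modality in modality_offsets
}

# Same task across modalities vs. different tasks within one modality.
same_task = cosine(tv[("capital", "text")], tv[("capital", "image")])
diff_task = cosine(tv[("capital", "text")], tv[("color", "text")])
print(f"same task, text vs image: {same_task:.3f}")
print(f"different task, text only: {diff_task:.3f}")
```

With real model activations, a higher similarity for the same task across modalities than for different tasks within one modality would be the kind of evidence the summary above describes.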
Multimodal Universal Attention Head

Seonglae Cho