Pixels Versus Priors

Creator
Seonglae Cho
Created
2025 Jun 7 18:19
Edited
2025 Jun 7 18:22

PvP

The study uses early decoding to test how a model handles visual information versus prior knowledge. It observes a 'flip' phenomenon: predictions initially rely on prior knowledge, then are reversed by visual information in the middle and late layers. Pixels Versus Priors (PvP) controls whether the model relies more on visual input or on prior knowledge by adding vectors to its activations.
Fundamentally, this is no different from CAA and ActAdd, except that it applies these concepts to multimodal knowledge-conflict scenarios and comes with its own dataset.
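The two ideas above, early decoding and activation addition, can be illustrated with a toy sketch. This is not the paper's implementation: the NumPy arrays stand in for real model activations, the two-token vocabulary is invented for illustration, and the function names (`steering_vector`, `steer`, `early_decode`) are hypothetical. The mean-difference construction is the CAA-style recipe, assumed here as one plausible way to obtain the vector.

```python
import numpy as np

def steering_vector(acts_visual, acts_prior):
    """CAA-style mean difference: direction from 'prior' toward 'visual' activations."""
    return acts_visual.mean(axis=0) - acts_prior.mean(axis=0)

def steer(hidden, vector, alpha):
    """ActAdd-style intervention: add a scaled vector to a hidden state."""
    return hidden + alpha * vector

def early_decode(hidden, unembed):
    """Project an intermediate hidden state onto the vocabulary and take the top token."""
    return int(np.argmax(unembed @ hidden))

# Toy setup (assumption): 4-dim hidden states, 2-token vocabulary
# where token 0 is the prior-based answer and token 1 is the visual answer.
unembed = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0]])

# Hypothetical activations collected on prior-consistent and visual-consistent examples.
acts_prior = np.array([[2.0, 0.0, 0.1, 0.0],
                       [2.0, 0.0, -0.1, 0.0]])
acts_visual = np.array([[0.0, 2.0, 0.1, 0.0],
                        [0.0, 2.0, -0.1, 0.0]])

v = steering_vector(acts_visual, acts_prior)

# A hidden state that currently decodes to the prior-based answer.
hidden = np.array([1.5, 0.5, 0.0, 0.0])

before = early_decode(hidden, unembed)                       # no intervention
toward_visual = early_decode(steer(hidden, v, 1.0), unembed)  # push toward pixels
toward_prior = early_decode(steer(hidden, v, -1.0), unembed)  # push toward priors

print(before, toward_visual, toward_prior)
```

In this toy case a positive coefficient flips the decoded answer from the prior token to the visual token, and a negative coefficient keeps it on the prior token. In a real model the vector would be added at a chosen layer during the forward pass, and the coefficient and layer would need tuning.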
