Adversarial Example

Creator
Seonglae Cho
Created
2024 Jan 14 9:43
Edited
2025 Aug 28 10:11

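Adversarial Examples work because of superposition, as described by the Superposition Hypothesis: a model represents more features than it has dimensions, so features overlap and interfere with one another.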

Because of this interference between superposed features, even a small stimulation of one feature can simultaneously disturb other features, allowing attackers to achieve large effects with minimal perturbations.
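The interference can be illustrated with a toy sketch. It assumes a hypothetical setup that is not taken from the source: three features stored as unit directions in a two-dimensional activation space, each read out by a simple dot product. The names `feature_directions`, `readouts`, and `perturb_along` are illustrative only.

```python
import math

# Hypothetical toy setup: 3 features share a 2-dimensional activation space.
# Their unit directions are 120 degrees apart, so they cannot all be orthogonal.
def feature_directions(n_features=3):
    return [
        (math.cos(2 * math.pi * i / n_features),
         math.sin(2 * math.pi * i / n_features))
        for i in range(n_features)
    ]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def readouts(activation, directions):
    # Linear readout of each feature: projection of the activation onto its direction.
    return [dot(activation, d) for d in directions]

def perturb_along(activation, direction, eps):
    # Nudge the activation by eps along a single feature direction.
    return tuple(a + eps * d for a, d in zip(activation, direction))

directions = feature_directions()
activation = (0.0, 0.0)
eps = 0.1

before = readouts(activation, directions)
after = readouts(perturb_along(activation, directions[0], eps), directions)
deltas = [a - b for a, b in zip(after, before)]

# Feature 0 was the only one stimulated, yet features 1 and 2 move as well.
print(deltas)
```

In this sketch, stimulating feature 0 by `eps` shifts the other two readouts by roughly `-eps / 2`, because their directions overlap with it. With orthogonal directions those shifts would be zero.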
Based on this, the paper proposes a chain: Adversarial Training → increased robustness → reduced superposition → increased interpretability, which connects robustness and interpretability.
arxiv.org
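As a rough illustration of what Adversarial Training involves, here is a minimal sketch that trains a logistic regression on inputs perturbed with the fast gradient sign method (FGSM). The source does not say which attack or model the paper uses, so FGSM, the tiny dataset, and the names `fgsm` and `adversarial_train` are all assumptions made for illustration. The sketch shows only the training loop and does not measure superposition or interpretability.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss(w, b, x, y):
    # Binary cross-entropy, clamped to avoid log(0).
    p = min(max(predict(w, b, x), 1e-12), 1 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    # For logistic regression, the loss gradient w.r.t. the input is (p - y) * w.
    g = predict(w, b, x) - y
    return [xi + eps * sign(g * wi) for xi, wi in zip(x, w)]

def adversarial_train(data, eps=0.1, lr=0.5, epochs=200, seed=0):
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in range(len(data[0][0]))]
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = fgsm(w, b, x, y, eps)    # attack the current model
            g = predict(w, b, x_adv) - y     # loss gradient w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x_adv)]
            b -= lr * g
    return w, b

# Hypothetical, linearly separable toy data: (input, label) pairs.
data = [([1.0, 1.0], 1), ([0.8, 1.2], 1), ([-1.0, -1.0], 0), ([-1.2, -0.8], 0)]
w, b = adversarial_train(data)

x, y = data[0]
x_adv = fgsm(w, b, x, y, eps=0.1)
print(loss(w, b, x, y), loss(w, b, x_adv, y))
```

The key design choice is that each gradient step is taken on the perturbed input `x_adv` rather than on `x`, so the model is fitted against the worst case inside the `eps` box around each training point.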
Adversarial examples influence humans too
Images altered to trick machine vision can influence humans too
In a series of experiments published in Nature Communications, we found evidence that human judgments are indeed systematically influenced by adversarial perturbations.
 
