AI Auditing

Creator
Seonglae Cho
Created
2025 Apr 27 18:1
Edited
2025 May 5 23:46

AI Monitoring

AI Auditing Methods
 
 
 
 
 
Internet Comment
Automatic bot detection and blocking cases

Prompts vs activation monitoring (OpenAI)

With little training data, suffix-only prompted last-token probing is the most data-efficient approach. With large training data, SAE max-pooled probing outperforms raw-activation probing and performs on par with prompted probing.
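As a rough illustration of the two feature-extraction styles compared above, here is a minimal, dependency-free sketch. It is not the setup from the OpenAI work: the activations are synthetic, the SAE encoder weights are random, and every name (`last_token_features`, `sae_max_pooled_features`, `fit_logistic_probe`, `w_enc`, `b_enc`) is hypothetical. The "prompted" part (appending a suffix to the input before reading the last token) is not modeled. The sketch only shows where the probe reads from.

```python
import math
import random


def last_token_features(acts):
    # acts[i] is one sequence: a list of per-token activation vectors.
    # A last-token probe reads only the final position.
    return [seq[-1] for seq in acts]


def sae_encode(vec, w_enc, b_enc):
    # Hypothetical SAE encoder: ReLU(vec @ w_enc + b_enc), w_enc is d_model x d_sae.
    return [max(0.0, sum(v * row[j] for v, row in zip(vec, w_enc)) + b_enc[j])
            for j in range(len(b_enc))]


def sae_max_pooled_features(acts, w_enc, b_enc):
    # Encode every token, then take the max of each SAE latent over positions.
    feats = []
    for seq in acts:
        latents = [sae_encode(tok, w_enc, b_enc) for tok in seq]
        feats.append([max(col) for col in zip(*latents)])
    return feats


def _sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_probe(xs, ys, lr=0.1, steps=500):
    # Linear probe trained with plain batch gradient descent on log loss.
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(steps):
        gw, gb = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, gw)]
        b -= lr * gb / len(xs)
    return w, b


def probe_accuracy(xs, ys, w, b):
    hits = sum((sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == bool(y)
               for x, y in zip(xs, ys))
    return hits / len(xs)


# Synthetic stand-in for model activations (assumption: no real model involved).
rng = random.Random(0)
d_model, d_sae, seq_len, n = 4, 8, 5, 40
ys = [i % 2 for i in range(n)]
acts = [[[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(seq_len)]
        for _ in range(n)]
for seq, y in zip(acts, ys):
    seq[-1][0] += 4.0 * y  # plant the label signal in the last token

w_enc = [[rng.gauss(0, 1) for _ in range(d_sae)] for _ in range(d_model)]
b_enc = [0.0] * d_sae

x_last = last_token_features(acts)
x_sae = sae_max_pooled_features(acts, w_enc, b_enc)
acc_last = probe_accuracy(x_last, ys, *fit_logistic_probe(x_last, ys))
acc_sae = probe_accuracy(x_sae, ys, *fit_logistic_probe(x_sae, ys))
```

Because the toy signal is planted in the last token and the SAE here is untrained, the two accuracies say nothing about the real comparison. The sketch is only meant to make the difference between last-token and max-pooled features concrete.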
 
 

Recommendations