CAST (Conditional Activation Steering)
activation-steering
IBM • Updated 2026 Feb 15 19:35
- Conditional SAE clamping
- Conditional SAE steering
- Constant SAE clamping
Conditional refusal steering
arxiv.org
https://arxiv.org/pdf/2409.05907
arxiv.org
https://arxiv.org/pdf/2411.11296v1
Sieve (2024.12)
Applied to code generation, specifically to keeping the model from using regex (a very simple and naive task)
Sieve: SAEs Beat Baselines on a Real-World Task (A Code Generation Case Study) | Tilde
Our methods achieve Pareto dominance on the axis of task success rate vs task constraint satisfaction vs general model performance.
https://www.tilderesearch.com/blog/sieve

tilde-research/sieve_coding · Hugging Face
https://huggingface.co/tilde-research/sieve_coding
Compares the Alpaca dataset with Sorry-Bench
- AI Condition Vector (extracted from the prompt)
- Refusal vector (applied to the response)
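
A minimal sketch of how the two vectors above could fit together, assuming a simplified gate: the condition vector is compared with the prompt's hidden state by cosine similarity, and the refusal vector is added to the response's hidden state only when that similarity passes a threshold. The names `mean_difference_vector`, `conditional_steer`, `threshold`, and `alpha` are hypothetical, the arrays are toy stand-ins for real activations, and the mean-difference extraction is a simple substitute rather than the papers' exact procedure.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def mean_difference_vector(pos, neg):
    """Difference of mean activations between two prompt sets.

    Assumption: a simple stand-in for vector extraction; rows are
    per-prompt hidden states, columns are hidden dimensions.
    """
    return pos.mean(axis=0) - neg.mean(axis=0)


def conditional_steer(h_prompt, h_response, condition_vec, refusal_vec,
                      threshold=0.5, alpha=4.0):
    """Add alpha * refusal_vec to the response state only when the
    prompt state is similar enough to the condition vector.

    threshold and alpha are hypothetical hyperparameters.
    Returns (steered_state, triggered).
    """
    triggered = cosine(h_prompt, condition_vec) > threshold
    if triggered:
        return h_response + alpha * refusal_vec, True
    return h_response, False


# Toy activations: "harmful" prompts differ from "harmless" ones on dim 0.
harmless = np.array([[0.0, 1.0, 0.0, 0.0], [0.0, 0.8, 0.2, 0.0]])
harmful = np.array([[1.0, 1.0, 0.0, 0.0], [1.0, 0.8, 0.2, 0.0]])

condition_vec = mean_difference_vector(harmful, harmless)
refusal_vec = np.array([0.0, 0.0, 0.0, 1.0])  # toy refusal direction

h_response = np.array([0.0, 0.0, 1.0, 0.0])
harmful_prompt = np.array([2.0, 0.5, 0.0, 0.0])
benign_prompt = np.array([0.1, 1.0, 0.0, 0.0])

steered, hit = conditional_steer(harmful_prompt, h_response,
                                 condition_vec, refusal_vec)
unchanged, miss = conditional_steer(benign_prompt, h_response,
                                    condition_vec, refusal_vec)
```

In this toy setup only the harmful-looking prompt passes the gate, so the benign response state is returned untouched. That selectivity is what separates conditional steering from constant steering, which would add the refusal vector regardless of the prompt.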


Seonglae Cho