Steer2Edit
Steer2Edit: From Activation Steering to Component-Level Editing
Steering methods influence Large Language Model behavior by identifying semantic directions in hidden representations, but are typically realized through inference-time activation interventions...
https://arxiv.org/abs/2602.09870

From Weights to Activations: Is Steering the Next Frontier of Adaptation?
Post-training adaptation of language models is commonly achieved through parameter updates or input-based methods such as fine-tuning, parameter-efficient adaptation, and prompting. In parallel, a...
https://arxiv.org/abs/2604.14090v1

Weight Arithmetic Steering
Steering Language Models with Weight Arithmetic
Providing high-quality feedback to Large Language Models (LLMs) on a diverse training distribution can be difficult and expensive, and providing feedback only on a narrow distribution can result...
https://arxiv.org/abs/2511.05408


Seonglae Cho