Dynamic SAE Guardrails

It's not dynamic but rather a type of conditional vector steering.

SAEs Can Improve Unlearning: Dynamic Sparse Autoencoder Guardrails...
Machine unlearning is a promising approach to improve LLM safety by removing unwanted knowledge from the model. However, prevailing gradient-based unlearning methods suffer from issues such as high...
https://openreview.net/forum?id=8gFO7ebDLT
https://arxiv.org/pdf/2504.08192v1

SAE DSG (Dynamic SAE Guardrail)
https://arxiv.org/pdf/2504.08192
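
To make the "conditional vector steering" reading concrete, here is a minimal sketch of the general idea: a steering vector is subtracted from the activations only when a trigger condition fires, otherwise the activations pass through unchanged. This is an illustration of the generic technique, not the paper's implementation. The function name `conditional_steer`, the projection-based trigger, and the `threshold` and `strength` parameters are all assumptions made for the example.

```python
import numpy as np

def conditional_steer(hidden, direction, threshold, strength):
    """Subtract a steering vector from `hidden` only if the trigger fires.

    hidden:    (tokens, d) array of activations (hypothetical layer output)
    direction: (d,) steering direction (hypothetical, e.g. one feature direction)
    Returns (possibly steered activations, whether the trigger fired).
    """
    unit = direction / np.linalg.norm(direction)
    # Projection of each token's activation onto the steering direction.
    scores = hidden @ unit
    # Condition: at least one token projects above the threshold.
    triggered = bool(np.any(scores > threshold))
    if not triggered:
        # No intervention: activations pass through unchanged.
        return hidden.copy(), triggered
    # Intervention: shift every token away from the direction.
    return hidden - strength * unit, triggered


if __name__ == "__main__":
    direction = np.array([0.0, 1.0])
    benign = np.array([[1.0, 0.0], [0.5, 0.2]])
    flagged = np.array([[1.0, 0.0], [0.0, 2.0]])
    print(conditional_steer(benign, direction, threshold=1.5, strength=3.0))
    print(conditional_steer(flagged, direction, threshold=1.5, strength=3.0))
```

The point of the sketch is that the intervention is gated by a fixed rule on the input. That is what makes it "conditional" rather than "dynamic" in the sense the note argues.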