Alignment Science Blog
We are Anthropic's Alignment Science team. We do machine learning research on the problem of steering and controlling future powerful AI systems, as well as understanding and evaluating the risks that they pose. Welcome to our blog!
https://alignment.anthropic.com/
Transformer Circuits Thread
Can we reverse engineer transformer language models into human-understandable computer programs?
https://transformer-circuits.pub/


Seonglae Cho