Alignment Science Blog
We are Anthropic's Alignment Science team. We do machine learning research on the problem of steering and controlling future powerful AI systems, as well as understanding and evaluating the risks that they pose. Welcome to our blog!
https://alignment.anthropic.com/

Transformer Circuits Thread
Can we reverse engineer transformer language models into human-understandable computer programs?
https://transformer-circuits.pub/