What could a solution to the alignment problem look like?
My currently favored approach to alignment research is to build a system that does alignment research better than us. But what would that system actually do? The obvious answer is "whatever we're doing right now." This is unsatisfactory because we're not actually trying to solve the whole alignment problem; we're just trying to build a better alignment researcher.
https://aligned.substack.com/p/alignment-solution