different models learn similar features and circuits when trained on similar tasks. Convergent learning (2016)arxiv.orghttps://arxiv.org/pdf/1511.07543.pdfA Toy Model of Universality (2023)arxiv.orghttps://arxiv.org/pdf/2302.03025.pdf