A public benchmark dataset containing around 80 'half-synthetic' transformers with well-understood internal circuits (algorithms). Evaluation on it is typically reported using ROC curves. Source: https://arxiv.org/pdf/2407.14494
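
As a rough illustration of how a ROC curve could be computed when the ground-truth circuit is known, the sketch below scores a handful of hypothetical edges and compares them with hypothetical ground-truth labels. The per-edge framing, the names `true_edges` and `edge_scores`, and the toy numbers are all assumptions made for illustration; they are not taken from the benchmark or its paper.

```python
from typing import List, Tuple


def roc_curve(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points found by sweeping a threshold over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Visit items from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        s = scores[order[i]]
        # Items with identical scores cross the threshold together.
        while i < len(order) and scores[order[i]] == s:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: 1 = edge belongs to the known circuit, 0 = it does not.
true_edges = [1, 1, 0, 1, 0, 0]
# Hypothetical importance scores from some circuit discovery method.
edge_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

points = roc_curve(true_edges, edge_scores)
roc_auc = auc(points)
print(points)
print(f"AUC = {roc_auc:.4f}")
```

An AUC near 1 would mean the method ranks true circuit edges above the rest, while 0.5 corresponds to a random ranking.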