CMD
Circuit Model Distance (CMD) measures how closely a circuit's output distribution matches that of the full model. It is computed as the area between the CPR (faithfulness) curve and the line f = 1, so a CMD of 0 is optimal.
Here, the ablated nodes are the low-level nodes not in the candidate circuit. CMD measures the proportion of test-time outputs that change when each non-circuit node is individually resample-ablated.
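The area computation described above can be sketched as follows. This is a minimal illustration, not the benchmark's reference implementation: it assumes faithfulness f has already been measured at a handful of circuit sizes (expressed as fractions of the full model), that the area is approximated with the trapezoidal rule, and that CPR is the plain area under the same curve. The function names and the sample numbers are hypothetical.

```python
def trapezoid_area(xs, ys):
    """Trapezoidal-rule area under the piecewise-linear curve through (xs[i], ys[i])."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


def cpr(sizes, faithfulness):
    """Area under the faithfulness curve (assumed definition of CPR)."""
    return trapezoid_area(sizes, faithfulness)


def cmd(sizes, faithfulness):
    """Area between the faithfulness curve and the line f = 1; 0 is optimal."""
    # Use the absolute gap so that overshooting f = 1 is penalised as well.
    gaps = [abs(1.0 - f) for f in faithfulness]
    return trapezoid_area(sizes, gaps)


# Hypothetical measurements: circuit size as a fraction of the full model,
# and the faithfulness observed at each size.
sizes = [0.0, 0.5, 1.0]
faithfulness = [0.0, 0.8, 1.0]

cmd_value = cmd(sizes, faithfulness)
cpr_value = cpr(sizes, faithfulness)
```

With these made-up numbers the CMD comes out at about 0.35 and the CPR at about 0.65. A circuit whose faithfulness is 1 at every size would give a CMD of exactly 0.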
MIB (Mechanistic Interpretability Benchmark)
All datasets consist of (original, n counterfactuals) pairs, which make explicit the cases where the model's output should stay the same and the cases where it should change.
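One way to picture such a pair is as a small record holding the original input and its counterfactuals. The sketch below is only an illustration of the idea: the class names, field names, and the toy prompts are assumptions made for this example and are not taken from the benchmark's data format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Counterfactual:
    prompt: str           # edited version of the original prompt
    expected_output: str  # what a correct model should answer for it


@dataclass(frozen=True)
class Example:
    prompt: str
    expected_output: str
    counterfactuals: Tuple[Counterfactual, ...]  # the n counterfactuals


def output_should_change(example: Example, cf: Counterfactual) -> bool:
    """True if the counterfactual is meant to flip the model's output."""
    return cf.expected_output != example.expected_output


# Toy example (made up for illustration): one original, n = 2 counterfactuals.
example = Example(
    prompt="The capital of France is",
    expected_output="Paris",
    counterfactuals=(
        Counterfactual("The capital of Italy is", "Rome"),         # should differ
        Counterfactual("The capital city of France is", "Paris"),  # should match
    ),
)

flags = [output_should_change(example, cf) for cf in example.counterfactuals]
```

Here `flags` marks, for each counterfactual, whether the output is expected to change relative to the original.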