Classifier Probe

Creator

Seonglae Cho

Created

2026 Feb 13 12:51

Editor

Seonglae Cho

Edited

2026 Feb 13 12:52

Refs

The Confidence Manifold

The Confidence Manifold: Geometric Structure of Correctness...

When a language model asserts that "the capital of Australia is Sydney," does it know this is wrong? We characterize the geometry of correctness representations across 9 models from 5 architecture...

https://arxiv.org/abs/2602.08159
