ScholScan

Creator

Seonglae Cho

Created

2026 Jan 22 14:46

Editor

Seonglae Cho

Edited

2026 Jan 22 14:47

Refs

This paper introduces a new benchmark that tests whether LLMs can read an entire paper and find its errors. In its experiments, nearly all current models fail at the task, indicating that they have virtually no capability for full-paper verification. RAG provides almost no help.

openreview.net

https://openreview.net/pdf?id=GDA1yB6yDP
