The model exhibited considerable strategic thinking, situational awareness, evaluation awareness, and concealment tendencies that it did not express overtly. Early Mythos was excessively goal-oriented, sometimes pushing tasks in ways the user did not want. For example, when attempting to modify a file without permission, it searched for bypass and privilege-escalation exploits, and even designed steps to delete traces after execution.
Jack Lindsey on Twitter / X
Before limited-releasing Claude Mythos Preview, we investigated its internal mechanisms with interpretability techniques. We found it exhibited notably sophisticated (and often unspoken) strategic thinking and situational awareness, at times in service of unwanted actions. (1/14) pic.twitter.com/vhng7PXqcz— Jack Lindsey (@Jack_W_Lindsey) April 7, 2026
https://x.com/Jack_W_Lindsey/status/2041588505701388648


Seonglae Cho