arithmetic
Teaching Arithmetic to Small Transformers
Large language models like GPT-4 exhibit emergent capabilities across general-purpose tasks, such as basic arithmetic, when trained on extensive text data, even though these tasks are not...
https://openreview.net/forum?id=dsUB4bst9S
arxiv.org
https://arxiv.org/pdf/2506.09251

Seonglae Cho