- Tokenizer Training (Rust BPE)
  - Train a BPE tokenizer on FineWeb-EDU data (tokenizer sketch after this list)
  - Configure vocab size + special tokens
- Base Pretraining
  - Train a general language model on large-scale text (FineWeb-EDU)
  - Next-token prediction objective (loss sketch below)
- Mid-training (Intermediate Adaptation Stage)
  - Use mixed data from SmolTalk (conversation), MMLU, and GSM8K (mixture sketch below)
  - Expand the model's internal reasoning and world knowledge, without loss masking or chat special tokens
- SFT (Supervised Fine-Tuning, Chat Format)
  - Use user ↔ assistant conversation-format data
  - Mask user messages; backpropagate on assistant tokens only (masking sketch below)
- Optional RL (GRPO / REINFORCE)
  - Update the model based on rewards for GSM8K problems
  - Generate multiple answer samples → apply a policy gradient (GRPO sketch below)
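Each stage above can be sketched in a few lines. For the tokenizer, a minimal sketch assuming the Hugging Face `tokenizers` library (a Rust-backed BPE implementation); the corpus path, vocab size, and special-token names are placeholders rather than the pipeline's actual configuration.

```python
# Minimal BPE tokenizer training sketch using the Rust-backed
# Hugging Face `tokenizers` library. Path, vocab size, and
# special-token names are illustrative placeholders.
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)

trainer = trainers.BpeTrainer(
    vocab_size=65536,  # placeholder vocab size
    special_tokens=["<|bos|>", "<|user|>", "<|assistant|>", "<|end|>"],  # placeholders
)

# Train from plain-text shards of the pretraining corpus (e.g. FineWeb-EDU dumps).
tokenizer.train(files=["fineweb_edu_shard_000.txt"], trainer=trainer)
tokenizer.save("tokenizer.json")
```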
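Base pretraining uses plain next-token prediction: shift the targets by one position and minimize cross-entropy. A minimal PyTorch sketch, assuming a `model` that maps token ids of shape (B, T) to logits of shape (B, T, V):

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens):
    """Next-token prediction: predict tokens[:, 1:] from tokens[:, :-1].

    tokens: (B, T) LongTensor of token ids.
    """
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    logits = model(inputs)  # (B, T-1, V)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # (B*(T-1), V)
        targets.reshape(-1),                  # (B*(T-1),)
    )
```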
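Mid-training keeps the same objective and mainly changes the data mixture. A sketch of weighted source sampling; the mixture weights here are illustrative assumptions, not values from the pipeline above.

```python
import random

# Illustrative mixture weights over the mid-training sources; the real
# proportions are a tuning choice, not taken from the pipeline above.
MIXTURE = {
    "smoltalk": 0.6,  # conversations
    "mmlu":     0.2,  # multiple-choice knowledge
    "gsm8k":    0.2,  # grade-school math
}

def sample_source(rng: random.Random) -> str:
    """Pick a data source according to the mixture weights."""
    names, weights = zip(*MIXTURE.items())
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
print([sample_source(rng) for _ in range(5)])
```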
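For SFT, the key mechanic is the loss mask: user tokens are excluded from the loss so gradients flow only through assistant tokens. A sketch using PyTorch's `ignore_index` convention, assuming a per-token boolean `assistant_mask` is available from the chat template:

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # cross_entropy skips targets with this value

def sft_loss(model, tokens, assistant_mask):
    """tokens: (B, T) token ids; assistant_mask: (B, T) bool, True on assistant tokens."""
    inputs = tokens[:, :-1]
    targets = tokens[:, 1:].clone()
    # Only back-propagate on positions whose *target* token is an assistant token.
    targets[~assistant_mask[:, 1:]] = IGNORE_INDEX
    logits = model(inputs)  # (B, T-1, V)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=IGNORE_INDEX,
    )
```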
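For the optional RL stage, a simplified GRPO-style REINFORCE sketch: sample a group of answers per GSM8K problem, score each (e.g. exact match on the final answer), use the group-normalized reward as the advantage, and take a policy-gradient step. The reward values below are toy numbers, and a real setup would add details such as KL regularization and token-level credit assignment.

```python
import torch

def grpo_loss(logprobs, rewards):
    """Group-relative policy gradient for one problem.

    logprobs: (G,) sum of token log-probs of each sampled answer,
              computed with gradients under the current policy.
    rewards:  (G,) scalar reward per sample, e.g. 1.0 if the final
              GSM8K answer is correct else 0.0.
    """
    rewards = rewards.float()
    # Group-relative advantage: normalize rewards within the group.
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-6)
    # REINFORCE: increase log-prob of samples with positive advantage.
    return -(advantages.detach() * logprobs).mean()

# Toy usage with fake numbers: 4 samples, 2 of them correct.
logprobs = torch.randn(4, requires_grad=True)
rewards = torch.tensor([1.0, 0.0, 1.0, 0.0])
loss = grpo_loss(logprobs, rewards)
loss.backward()
```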
LLMs should be viewed not as a single model but as a model family controlled by a single dial, compute, which lets us verify scaling laws and build confidence in "large-scale training". Using depth as the dial, train d10–d20 models under the same FLOPs budget; if the loss curves don't cross, each model is compute-optimal.
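One way to operationalize the "curves don't cross" check: evaluate each depth's loss on a shared FLOPs grid and test whether any pair of curves swaps order. A sketch with synthetic toy curves, used purely to illustrate the check:

```python
import numpy as np

def curves_cross(losses):
    """losses: dict depth -> np.array of losses on a shared FLOPs grid.

    Returns True if any two models swap order anywhere on the grid,
    i.e. the family is not cleanly ordered across the budget.
    """
    depths = sorted(losses)
    for i, a in enumerate(depths):
        for b in depths[i + 1:]:
            diff = losses[a] - losses[b]
            if np.any(diff > 0) and np.any(diff < 0):
                return True
    return False

# Toy curves on a shared FLOPs grid (illustrative numbers only).
flops = np.logspace(18, 20, 50)
toy = {d: 3.5 - 0.1 * d + 8.0 / np.log10(flops) for d in (10, 14, 20)}
print(curves_cross(toy))  # False: these toy curves never swap order
```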

Seonglae Cho