Supervised Fine-tuning Trainer (SFTTrainer)
Not reinforcement learning, but included in TRL as a wrapper.
max_seq_length - default is min(tokenizer.model_max_length, 1024)
packing - Dataset Packing
formatting_func - function that turns dataset examples into training text
https://huggingface.co/docs/trl/sft_trainer
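To make the three options noted above (max_seq_length, packing, formatting_func) more concrete, here is a minimal sketch in plain Python. It is not TRL's implementation and does not import TRL. The function names `default_max_seq_length`, `format_example`, and `pack`, the prompt template, and the record fields `question` / `answer` are all hypothetical, chosen only for illustration. Packing is assumed here to mean: join tokenized examples with an EOS token between them, then cut the stream into fixed-length blocks.

```python
from typing import Dict, Iterable, List


def default_max_seq_length(model_max_length: int) -> int:
    # Mirrors the default from the notes: min(tokenizer.model_max_length, 1024).
    return min(model_max_length, 1024)


def format_example(example: Dict[str, str]) -> str:
    # Hypothetical formatting_func: maps one record to one training string.
    # The field names and template are assumptions, not a TRL requirement.
    return f"### Question: {example['question']}\n### Answer: {example['answer']}"


def pack(token_lists: Iterable[List[int]], seq_length: int, eos_id: int) -> List[List[int]]:
    # Simplified packing: concatenate every example, appending EOS after each,
    # so short examples share a block instead of being padded individually.
    buffer: List[int] = []
    for tokens in token_lists:
        buffer.extend(tokens)
        buffer.append(eos_id)
    # Cut into equal blocks; the trailing remainder that cannot fill a block is dropped.
    n_blocks = len(buffer) // seq_length
    return [buffer[i * seq_length:(i + 1) * seq_length] for i in range(n_blocks)]


if __name__ == "__main__":
    print(default_max_seq_length(2048))
    print(format_example({"question": "What is SFT?", "answer": "Supervised fine-tuning."}))
    print(pack([[1, 2, 3], [4, 5], [6]], seq_length=4, eos_id=0))
```

The point of packing is throughput: every block is exactly `seq_length` tokens, so no compute is spent on padding. The cost in this sketch is that an example can be split across two blocks and the leftover tokens at the end are discarded.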