Continuing Pre-Training from Model Checkpoint
Hi, I pre-trained a language model on my own data and I want to continue the pre-training for additional steps from the last checkpoint. I am planning to use the code below, but I want to be sure that everything is correct before starting. Let's say that I saved all of my files into `CRoBERTa`:

```python
model = RobertaForMaskedLM.from_pretrained('CRoBERTa/checkpoint-…')
tokenizer = RobertaTokenizerFast.from_pretrained('CRoBERTa', max_len=512, padding='longest')
training...
```
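For reference, below is a minimal sketch of one common way to continue pre-training with the `Trainer` API. It assumes a checkpoint layout produced by `Trainer` under `CRoBERTa/` and a hypothetical plain-text corpus file `corpus.txt`; adjust paths and hyperparameters to your own run. Two details worth noting: `resume_from_checkpoint` restores the optimizer, scheduler, and global step along with the weights, so `max_steps` must be raised above the step the checkpoint was saved at or training will end immediately; and `max_len` is a deprecated tokenizer kwarg in recent versions of transformers, where `model_max_length` is used instead.

```python
# A sketch of continuing MLM pre-training, assuming a Trainer-style checkpoint
# under CRoBERTa/ and a hypothetical plain-text corpus at corpus.txt.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("CRoBERTa", model_max_length=512)
model = RobertaForMaskedLM.from_pretrained("CRoBERTa")  # weights are reloaded from the checkpoint on resume

# Tokenize the raw corpus; per-batch padding is handled by the collator,
# so the tokenizer only needs to truncate.
raw = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
train_dataset = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Standard MLM collator: pads each batch and masks 15% of tokens.
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="CRoBERTa",
    overwrite_output_dir=False,     # keep the existing checkpoints
    max_steps=200_000,              # must exceed the step the checkpoint stopped at
    per_device_train_batch_size=16,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)

# True -> resume from the latest checkpoint in output_dir;
# a path string to a specific checkpoint directory also works.
trainer.train(resume_from_checkpoint=True)
```

If you instead want a fresh optimization schedule (e.g., a new learning-rate warmup) starting from the checkpoint weights, load the checkpoint directly with `from_pretrained('CRoBERTa/checkpoint-…')` as in the question and call `trainer.train()` without `resume_from_checkpoint`.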
https://discuss.huggingface.co/t/continuing-pre-training-from-model-checkpoint/11392