RedPajama

Creator: Seonglae Cho
Created: 2023 May 1 7:33
Editor: Seonglae Cho
Edited: 2025 Mar 2 15:31
Refs: MPT
Size: TB
Multilingual
 
togethercomputer/RedPajama-Data-1T · Datasets at Hugging Face
https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T
RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens — TOGETHER
RedPajama is a project to create a set of leading, fully open-source models. Today, we are excited to announce the completion of the first step of this project: the reproduction of the LLaMA training dataset of over 1.2 trillion tokens.
https://www.together.xyz/blog/redpajama

Copyright Seonglae Cho