Batched data is a dict of columns, each holding a list of values: { column1: [...], column2: [...] }
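To make the dict-of-lists layout concrete, here is a minimal plain-Python sketch. It does not import `datasets`; the helper names `rows_to_batch` and `add_length` are hypothetical and only illustrate the shape a batched function receives and returns.

```python
from typing import Any, Dict, List


def rows_to_batch(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert row-oriented records into a column-oriented batch."""
    batch: Dict[str, List[Any]] = {}
    for row in rows:
        for column, value in row.items():
            batch.setdefault(column, []).append(value)
    return batch


def add_length(batch: Dict[str, List[Any]]) -> Dict[str, List[int]]:
    """Batched function: return a new column with one value per row."""
    return {"length": [len(text) for text in batch["column1"]]}


rows = [{"column1": "a", "column2": 0}, {"column1": "bcd", "column2": 1}]
batch = rows_to_batch(rows)   # one list per column
lengths = add_length(batch)   # new column, same number of rows
```

With the library itself, a function shaped like `add_length` would typically be passed as `dataset.map(add_length, batched=True)`.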
dataset.map(function, with_indices=True, with_rank=True)  # function receives (example, idx, rank)
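When both flags are set, the mapped function gets two extra positional arguments, the indices first and the rank second. The sketch below assumes batched mode and calls the function directly in plain Python so the signature is visible; `tag_rows` and `demo_batch` are hypothetical names, and the rank may be `None` when no multiprocessing is used.

```python
from typing import Any, Dict, List, Optional


def tag_rows(
    batch: Dict[str, List[Any]],
    indices: List[int],
    rank: Optional[int],
) -> Dict[str, List[int]]:
    """Attach the row index and the worker rank to every row in the batch."""
    worker = 0 if rank is None else rank  # fall back to 0 without multiprocessing
    return {
        "row_id": list(indices),
        "worker": [worker] * len(indices),
    }


# Direct call that mimics what map would pass for one batch of three rows.
demo_batch = {"column1": ["a", "b", "c"]}
tagged = tag_rows(demo_batch, indices=[0, 1, 2], rank=0)
```

With the library, this would roughly correspond to `dataset.map(tag_rows, batched=True, with_indices=True, with_rank=True, num_proc=2)`, where `num_proc=2` is only an example value.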
batched_dataset = dataset.batch(batch_size=32)
# Iterate over the batched dataset
for batch in batched_dataset:
    print(batch)
    break
Main classes
https://huggingface.co/docs/datasets/package_reference/main_classes#datasets.DatasetDict.map
Stream
https://huggingface.co/docs/datasets/stream

Seonglae Cho