xmtf
Question about MTFDataset
https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/main/megatron/data/mtf_dataset.py#L34
The `MTFDataset` class takes `documents` as an argument but doesn't use it (except in an assert statement).
I think `documents` holds the indices for the train/valid/test split. Is it OK to ignore `documents`?
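
To make the concern concrete, here is a toy sketch of the two behaviours I have in mind. This is not the Megatron-DeepSpeed code: the class names `IgnoresDocuments` and `HonorsDocuments` are hypothetical, the data is a plain list, and I am assuming `documents` is a sequence of document indices belonging to one split.

```python
class IgnoresDocuments:
    """Toy dataset that only validates `documents`, then indexes all data."""

    def __init__(self, data, documents):
        # `documents` is checked for range but never stored or used again.
        assert min(documents) >= 0
        assert max(documents) < len(data)
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]


class HonorsDocuments:
    """Toy dataset that restricts indexing to the given `documents`."""

    def __init__(self, data, documents):
        assert min(documents) >= 0
        assert max(documents) < len(data)
        self.data = data
        self.documents = list(documents)

    def __len__(self):
        return len(self.documents)

    def __getitem__(self, idx):
        # Map the split-local index to the underlying document index.
        return self.data[self.documents[idx]]


data = ["a", "b", "c", "d", "e"]
valid_docs = range(3, 5)  # hypothetical validation split: documents 3 and 4

ignoring = IgnoresDocuments(data, valid_docs)
honoring = HonorsDocuments(data, valid_docs)
```

In this sketch, `len(ignoring)` is the size of the whole dataset while `len(honoring)` is the size of the split. If the real class behaves like the first one, every split built from the same data prefix would cover the same samples, which is what I want to confirm.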