llm-foundry
Enable streaming of local finetuning dataset
The current code path for streaming finetuning datasets does not allow streaming from a local path, even though this works out of the box for text datasets and is already supported by the StreamingFinetuningDataset class. This diff only changes how the local setting is handled when building finetuning dataloaders. It has been tested by finetuning MPT-7B on a local streaming copy of the dolly_hhrlhf dataset.
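For reviewers, here is a minimal sketch of the kind of check this change is about. It is not the actual llm-foundry code: the function names `uses_streaming_before` and `uses_streaming_after`, the config keys `remote` and `local`, and the example path are assumptions made for illustration only.

```python
from typing import Any, Mapping


def uses_streaming_before(dataset_cfg: Mapping[str, Any]) -> bool:
    """Hypothetical old behaviour: stream only when a remote path is set."""
    return dataset_cfg.get("remote") is not None


def uses_streaming_after(dataset_cfg: Mapping[str, Any]) -> bool:
    """Hypothetical new behaviour: stream when a remote OR a local path is set."""
    return (dataset_cfg.get("remote") is not None
            or dataset_cfg.get("local") is not None)


# Illustrative config: a dataset that exists only on local disk.
# The path below is made up for this example.
local_only_cfg = {"local": "/data/dolly_hhrlhf", "split": "train"}

print(uses_streaming_before(local_only_cfg))
print(uses_streaming_after(local_only_cfg))
```

With a local-only config, the first check falls through to the non-streaming path, while the second one selects the streaming dataset, which is the behaviour described above.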
Hey @eldarkurtic, thanks for the change! Could you please run pre-commit run --all-files locally to apply the auto-formatting?