llm-foundry Enable streaming of local finetuning dataset

Open eldarkurtic opened this issue 1 year ago • 1 comment

The current code path for streaming finetuning datasets does not allow streaming from a local path, even though this works out of the box for text datasets and is already supported by the StreamingFinetuningDataset class. This diff only changes how the local path is taken into account when building finetuning dataloaders. It has been tested by finetuning MPT-7B on the dolly_hhrlhf dataset streamed from a local path.
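To illustrate the idea, here is a minimal sketch of the kind of check involved. It is not the actual diff: the function name select_dataset_kind is hypothetical, and the config keys remote, local, and hf_name are assumptions about the dataset config used only for illustration.

```python
from typing import Any, Mapping


def select_dataset_kind(dataset_cfg: Mapping[str, Any]) -> str:
    """Hypothetical helper: decide which dataset path a config should take.

    Assumed keys (illustrative only): 'remote', 'local', 'hf_name'.
    """
    remote = dataset_cfg.get("remote")
    local = dataset_cfg.get("local")

    # Treat the dataset as streaming when either a remote or a local
    # path is given, instead of requiring a remote path.
    if remote is not None or local is not None:
        return "streaming"

    # Otherwise fall back to a non-streaming dataset identified by name.
    if dataset_cfg.get("hf_name") is not None:
        return "hf"

    raise ValueError("dataset config needs one of: remote, local, hf_name")


if __name__ == "__main__":
    # The path below is a placeholder, not a real location.
    print(select_dataset_kind({"local": "/path/to/dolly_hhrlhf"}))
```

With a check like this, a config that sets only a local path is routed to the streaming dataset rather than being rejected or treated as a non-streaming dataset.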

Jan 15 '24 11:01 eldarkurtic

Hey @eldarkurtic, thanks for the change! Could you please run pre-commit run --all-files locally to apply the auto-formatting?

Jan 17 '24 00:01 dakinggg
