llm-foundry
Data validation notebook
- add notebook/data_validation_notebook, which runs data preparation and token counting, from the byod/data_validation branch. Merging it to main keeps the underlying functions up to date.
- add utility functions used by notebook/data_validation_notebook
- move functions from convert_text_to_mds into the data prep utils, with minor modifications
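
For readers unfamiliar with the token-counting step, here is a minimal sketch of the idea. It is not the notebook's code: the function names, the `prompt`/`response` keys, and the whitespace tokenizer are all assumptions made for illustration, and a real run would presumably pass in the model's own tokenizer.

```python
from typing import Callable, Dict, Iterable, List, Sequence


def whitespace_tokenize(text: str) -> List[str]:
    # Stand-in tokenizer (assumption): splits on whitespace only.
    # Swap in a real tokenizer's encode function for meaningful counts.
    return text.split()


def count_tokens(
    samples: Iterable[Dict[str, str]],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
    keys: Sequence[str] = ("prompt", "response"),  # hypothetical field names
) -> int:
    """Sum the number of tokens across the given fields of every sample."""
    total = 0
    for sample in samples:
        for key in keys:
            # Missing fields count as empty text rather than raising.
            total += len(tokenize(sample.get(key, "")))
    return total


if __name__ == "__main__":
    data = [
        {"prompt": "What is MDS?", "response": "A sharded dataset format."},
        {"prompt": "Count these tokens."},
    ]
    print(count_tokens(data))
```

Passing the tokenizer as a callable keeps the counting logic independent of any particular tokenizer library.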