[Chatllama] Merge the datasets to create more insightful training data
Description
Currently, the supported datasets can only be used as alternatives to one another. It would be nice to add diversity to the training data by letting users select "recipes" that merge these datasets, producing more insightful training runs.
TODO
- [ ] Define which parameters need to be specified to create a "recipe" for the dataset, and add them to the config files.
- [ ] Expand the dataset class to allow parameters from the config file to generate the appropriate dataset mixture.
- [ ] Evaluate the possible increase in model quality due to different "recipes" used.
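
To make the TODO items concrete, here is a minimal sketch of what a "recipe" could look like. It is not ChatLLaMA's actual API: the function name `mix_datasets`, the recipe format (a mapping from dataset name to relative weight), and the `source` field are all hypothetical, chosen only for illustration. Each dataset is assumed to be already loaded as a list of dict examples.

```python
import random
from typing import Dict, List, Sequence


def mix_datasets(
    datasets: Dict[str, Sequence[dict]],
    recipe: Dict[str, float],
    total_size: int,
    seed: int = 0,
) -> List[dict]:
    """Build a mixture of roughly `total_size` examples from several datasets.

    `recipe` maps a dataset name to a relative weight (hypothetical format);
    weights are normalized, so they do not need to sum to 1.
    """
    unknown = set(recipe) - set(datasets)
    if unknown:
        raise KeyError(f"recipe references unknown datasets: {sorted(unknown)}")
    weight_sum = sum(recipe.values())
    if weight_sum <= 0:
        raise ValueError("recipe weights must sum to a positive value")

    rng = random.Random(seed)  # seeded so the mixture is reproducible
    mixture: List[dict] = []
    for name, weight in recipe.items():
        # Share of the mixture contributed by this dataset. Rounding means
        # the final length can differ slightly from total_size.
        n = round(total_size * weight / weight_sum)
        # Sample with replacement so a small dataset can still be upweighted.
        picks = rng.choices(datasets[name], k=n)
        # Tag each example with its origin to help later evaluation.
        mixture.extend({**example, "source": name} for example in picks)
    rng.shuffle(mixture)
    return mixture


if __name__ == "__main__":
    # Placeholder data; real examples would come from the supported datasets.
    data = {
        "dataset_a": [{"text": "a1"}, {"text": "a2"}],
        "dataset_b": [{"text": "b1"}],
    }
    mixed = mix_datasets(data, {"dataset_a": 0.75, "dataset_b": 0.25}, total_size=8)
    print(len(mixed), sorted({ex["source"] for ex in mixed}))
```

In this sketch the recipe weights would live in the config file, and the dataset class would call something like `mix_datasets` when building the training set. Keeping the `source` tag on each example is one way to support the third TODO item, since model quality can then be broken down per source dataset.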
Hi there! I'd like to work on this issue.
Hello @mohsinmahmood12, thanks a lot for the interest in ChatLLaMA. I assigned the issue to you! Let us know if you face any difficulties with the task 😄