Open-Assistant
Update WizardLM dataset
SFT-8 training is using a slightly less cleaned version
Beyond SFT-8 we should replace it with the newer, more thoroughly cleaned version
I think this is just a matter of changing the HF dataset ID
Is this still a current issue? I tried to update the dataset, but the whole structure of the dataset has changed. I am also not sure whether the new version is free of Vicuna data; if it is not, training with WizardLM + Vicuna might produce duplicates.
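One way to check the duplicate concern before training is to compare normalized prompts across the two datasets. The following is only a rough sketch, not how the training code actually loads data: it assumes each dataset has already been loaded into a list of dicts, and the field name `instruction` and the helper names are hypothetical.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not hide a duplicate."""
    return re.sub(r"\s+", " ", text).strip().lower()


def prompt_key(text: str) -> str:
    """Hash the normalized prompt so the lookup set stays small."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def drop_overlap(primary, secondary, field="instruction"):
    """Return the rows of `secondary` whose prompt does not also appear in `primary`.

    `field` is an assumed column name; adjust it to the real dataset schema.
    """
    seen = {prompt_key(row[field]) for row in primary}
    return [row for row in secondary if prompt_key(row[field]) not in seen]


# Toy rows standing in for the two datasets (not real data).
wizardlm_rows = [{"instruction": "Explain  gradient descent."}]
vicuna_rows = [
    {"instruction": "explain gradient descent."},
    {"instruction": "Write a haiku about rain."},
]
unique_vicuna_rows = drop_overlap(wizardlm_rows, vicuna_rows)
```

This only catches exact matches after normalization; paraphrased or lightly edited prompts would need a fuzzier comparison.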