DeepSpeedExamples
Fixed the dropout setting bug in DeepSpeed SQuAD fine-tuning code
It seems there is a bug in our DeepSpeed SQuAD fine-tuning code. The model configuration file contains duplicated keys for the dropout probability settings. When the same key appears twice, the second key-value pair can overwrite the first, so the dropout value set in the first pair has no effect in the current script.
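To illustrate the failure mode, here is a minimal sketch of how duplicated keys behave when a JSON config is parsed with Python's standard `json` module. The key names and values in `CONFIG_TEXT` are assumptions made for the example, not copied from the actual configuration file, and `find_duplicate_keys` is a hypothetical helper, not part of DeepSpeed.

```python
import json
from collections import Counter

# Hypothetical config fragment: key names and values are illustrative only.
CONFIG_TEXT = """
{
  "hidden_dropout_prob": 0.0,
  "attention_probs_dropout_prob": 0.0,
  "hidden_dropout_prob": 0.1,
  "attention_probs_dropout_prob": 0.1
}
"""


def find_duplicate_keys(text):
    """Return the sorted list of keys that occur more than once in any JSON object."""
    duplicates = set()

    def hook(pairs):
        # `pairs` holds every (key, value) pair of one object, duplicates included.
        counts = Counter(key for key, _ in pairs)
        duplicates.update(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return sorted(duplicates)


# With the default parser the last occurrence of a key silently wins.
config = json.loads(CONFIG_TEXT)
print(config["hidden_dropout_prob"])

# A check like this could be run before training to catch the problem early.
print(find_duplicate_keys(CONFIG_TEXT))
```

Because the parser raises no error or warning for repeated keys, a check along these lines is one way to catch the problem before a run starts.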