Tim Isbister
Wondering the same. Surprised this hasn't been demonstrated!
But do we really need to manually add domain-specific (out-of-vocab) words? Isn't the purpose of word pieces that they can, in theory, construct new words by combining...
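To make the word-piece point concrete, here is a minimal sketch of greedy longest-match subword splitting in the style of WordPiece. The function name `wordpiece_tokenize`, the toy vocabulary, and the example word are all assumptions made up for illustration; a real tokenizer's vocabulary and normalization will differ.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into sub-word pieces by greedy longest-match-first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Non-initial pieces carry the "##" continuation prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece covers this position, so the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary: "cardiomyopathy" itself is NOT in it.
toy_vocab = {"cardio", "##my", "##opathy", "##pathy", "##o", "heart"}

print(wordpiece_tokenize("cardiomyopathy", toy_vocab))
```

With this toy vocabulary the unseen word is still represented by three known pieces, which is the behaviour the question refers to. Whether those pieces give good representations for a specialised domain is a separate question that the sketch does not answer.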
Same here; ``` deepspeed.runtime.zero.utils.ZeRORuntimeException: The checkpoint being loaded used a DP world size of 7 but the current world size is 8. Automatic adjustment of ZeRO's optimizer state partitioning with...
I will look at this!
# gpt-sw3-20b ```html {"dataset": "swerec", "task": "sentiment-classification", "dataset_languages": ["sv"], "model": "AI-Sweden-Models/gpt-sw3-20b", "results": {"raw": {"test": [{"mcc": 0.7889415012271835, "macro_f1": 0.7977647031307488}, {"mcc": 0.7482306507426133, "macro_f1": 0.7624724867558679}, {"mcc": 0.7385365394017152, "macro_f1": 0.7661884945460464}, {"mcc": 0.7953827412453802, "macro_f1": 0.8079224208867918},...
# gpt-sw3-20b-instruct ``` {"dataset": "swerec", "task": "sentiment-classification", "dataset_languages": ["sv"], "model": "AI-Sweden-Models/gpt-sw3-20b-instruct", "results": {"raw": {"test": [{"mcc": 0.6906651751805405, "macro_f1": 0.7226691138787769}, {"mcc": 0.6822053393033132, "macro_f1": 0.7131633530990946}, {"mcc": 0.5852277661641855, "macro_f1": 0.6392850178395086}, {"mcc": 0.7515938029833797, "macro_f1": 0.7649167883743853},...