DialoGPT
Incorrect DSTC medium model link: pkl file is the small model
Hi there, thanks for sharing your amazing work on GitHub. I just wanted to point out that the link (https://convaisharables.blob.core.windows.net/lsp/DSTC/medium_ft.pkl) shared in the README for the DSTC medium model actually points to the small GPT-2 version.
Its file size is 351.3MB rather than the 863MB expected for medium GPT-2. When loading the pkl file, there are only 12 transformer blocks with a hidden state size of 768, rather than the medium model's 1024.
It would be great if you could share the corrected DSTC medium model link. Thanks!
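For reference, the check described above can be automated: a minimal sketch that infers the layer count and hidden size from a checkpoint's state dict. It assumes parameter names follow the Hugging Face transformers GPT-2 convention (e.g. `transformer.h.11.attn.c_attn.weight`, `transformer.wte.weight`); the raw pkl files in this repo may use different key names, so adjust the patterns accordingly.

```python
import re

def infer_gpt2_config(param_shapes):
    """Infer (n_layer, hidden_size) from a mapping of
    parameter name -> shape tuple, assuming GPT-2-style
    key names like 'transformer.h.<i>.attn.c_attn.weight'."""
    layer_indices = set()
    hidden = None
    for name, shape in param_shapes.items():
        # Transformer blocks are numbered h.0, h.1, ...
        m = re.search(r"\bh\.(\d+)\.", name)
        if m:
            layer_indices.add(int(m.group(1)))
        # Token embedding matrix is (vocab_size, hidden_size)
        if name.endswith("wte.weight"):
            hidden = shape[1]
    return len(layer_indices), hidden
```

With a real checkpoint you would build `param_shapes` via `{k: tuple(v.shape) for k, v in torch.load(path).items()}`; a small GPT-2 checkpoint should yield `(12, 768)` and a medium one `(24, 1024)`.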
Hi, thanks for the feedback. We use three different model sizes: 117M (small), 345M (medium), and 762M (large), which might not correspond to the GPT-2 sizes. We don't have the 863MB model available for DSTC.