
Incorrect DSTC medium model link: pkl file is the small model

Open: alvinchangw opened this issue 4 years ago • 1 comment

Hi there, thanks for sharing your amazing work on GitHub. Just wanted to point out that the link (https://convaisharables.blob.core.windows.net/lsp/DSTC/medium_ft.pkl) shared in the README for the DSTC medium model actually points to the small GPT-2 version.

The file is 351.3MB, rather than the 863MB expected for medium GPT-2. When I load the pkl file, it has only 12 transformer blocks with a hidden state size of 768, whereas the medium model has 24 blocks and a hidden state size of 1024.
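For anyone who wants to repeat this check, here is a minimal sketch of how the block count and hidden size could be read off a checkpoint. It assumes the pkl file is a PyTorch state dict with GPT-2 style parameter names (block `i` under `h.<i>.`, token embedding ending in `wte.weight`); the helper names `summarize_shapes` and `load_shapes` are made up for illustration and are not part of the DialoGPT repo.

```python
import re
from typing import Dict, Mapping, Optional, Tuple

# Assumption: GPT-2 style names such as "transformer.h.3.attn.c_attn.weight"
# (or "h.3...." without the prefix) for parameters of transformer block 3.
BLOCK_RE = re.compile(r"(?:^|\.)h\.(\d+)\.")


def summarize_shapes(shapes: Mapping[str, Tuple[int, ...]]) -> Dict[str, Optional[int]]:
    """Infer the number of transformer blocks and the hidden size from parameter shapes."""
    block_ids = set()
    hidden_size = None
    for name, shape in shapes.items():
        match = BLOCK_RE.search(name)
        if match:
            block_ids.add(int(match.group(1)))
        # Assumption: the token embedding is stored as (vocab_size, hidden_size).
        if name.endswith("wte.weight") and len(shape) == 2:
            hidden_size = shape[1]
    return {"n_blocks": len(block_ids), "hidden_size": hidden_size}


def load_shapes(path: str) -> Dict[str, Tuple[int, ...]]:
    """Read a checkpoint and return {parameter name: shape}. Requires PyTorch."""
    import torch  # imported lazily so summarize_shapes works without PyTorch

    # torch.load unpickles the file, so only use it on checkpoints you trust.
    state = torch.load(path, map_location="cpu")
    return {k: tuple(v.shape) for k, v in state.items() if hasattr(v, "shape")}
```

With the file downloaded locally, `summarize_shapes(load_shapes("medium_ft.pkl"))` should report the block count and hidden size described above, provided the naming assumptions hold for that checkpoint.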

Would be great if you can share the corrected DSTC medium model link! Thanks!

alvinchangw, May 07 '20 03:05

Hi, thanks for the feedback. We use 3 different model sizes: 117M (small), 345M (medium) and 762M (large), which might not correspond to the GPT-2 sizes. We don't have an 863M model available for DSTC.

dreasysnail, Jun 12 '20 22:06