
hosting weights on github repository release

Open fcakyon opened this issue 4 years ago • 5 comments

Why not host the pretrained weights in a GitHub repository release? You can upload files several GB in size as release assets, and with a simple GET request the weights can be downloaded locally. It would free us from the unnecessary wandb dependency when downloading pretrained weights.

I can show you an example if you'd like to go in this direction 👍
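A minimal sketch of the suggested approach, assuming a hypothetical asset filename (`example.pt`) on the release page linked later in this thread -- the helper names are illustrative, not from the repo:

```python
import os
import urllib.request


def release_asset_url(owner: str, repo: str, tag: str, filename: str) -> str:
    """Build the direct-download URL for an asset attached to a GitHub release."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{filename}"


def download_weights(url: str, dest: str, chunk_size: int = 1 << 20) -> str:
    """Stream a (possibly multi-GB) release asset to disk; skip if already cached."""
    if not os.path.exists(dest):
        with urllib.request.urlopen(url) as resp, open(dest, "wb") as f:
            while True:
                chunk = resp.read(chunk_size)
                if not chunk:
                    break
                f.write(chunk)
    return dest


if __name__ == "__main__":
    # Hypothetical asset name -- substitute the actual .pt file listed on the release.
    url = release_asset_url("johnpaulbin", "DALLE-models", "model", "example.pt")
    download_weights(url, "example.pt")
```

Streaming in chunks keeps memory flat even for multi-GB checkpoint files, and the cache check avoids re-downloading on every run.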

fcakyon avatar Jul 08 '21 21:07 fcakyon

Sounds really promising! Would you like to open a pull request for a model in the transformer folder?

robvanvolt avatar Jul 09 '21 08:07 robvanvolt

I feel like this would go great with the newest "general" dall-e models.

johnpaulbin avatar Jul 14 '21 02:07 johnpaulbin

Okay, well, a PR doesn't transfer releases 😅 should have seen this coming. Anyway, I've DMed you on Discord about how to set up .pt files on GitHub via releases. Here is 3E: https://github.com/johnpaulbin/DALLE-models/releases/tag/model

johnpaulbin avatar Jul 14 '21 03:07 johnpaulbin

> Okay, well, a PR doesn't transfer releases 😅 should have seen this coming. Anyway, I've DMed you on Discord about how to set up .pt files on GitHub via releases. Here is 3E: https://github.com/johnpaulbin/DALLE-models/releases/tag/model

May I ask, in what configuration was the model trained, and how long did the 3-epoch (3E) training take?

ckmstydy avatar Aug 30 '21 05:08 ckmstydy

Sure!:)

The naming scheme can also be found in the readme at https://github.com/robvanvolt/DALLE-models/readme.md

16L_64HD_8H_512I_128T_cc12m_cc3m_3E

means 16 layers, 64 head dimensions, 8 heads, 512 image dimensions, and 128 text dimensions, trained on cc12m and cc3m for 3 epochs, which took around one week on an RTX 3090.
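The naming scheme above can be decoded mechanically. A small sketch, with field names that are my own labels inferred from the explanation (not identifiers from the repo):

```python
import re

# Suffix -> config field, per the naming scheme above (labels are illustrative).
SUFFIXES = {
    "L": "layers",
    "HD": "head_dim",
    "H": "heads",
    "I": "image_dim",
    "T": "text_dim",
    "E": "epochs",
}


def parse_model_name(name: str):
    """Split a release name like '16L_64HD_8H_512I_128T_cc12m_cc3m_3E'
    into a config dict plus the list of training datasets."""
    config, datasets = {}, []
    for part in name.split("_"):
        m = re.fullmatch(r"(\d+)([A-Z]+)", part)
        if m and m.group(2) in SUFFIXES:
            config[SUFFIXES[m.group(2)]] = int(m.group(1))
        else:
            datasets.append(part)  # lowercase parts like cc12m are dataset names
    return config, datasets
```

For example, `parse_model_name("16L_64HD_8H_512I_128T_cc12m_cc3m_3E")` yields `{"layers": 16, "head_dim": 64, "heads": 8, "image_dim": 512, "text_dim": 128, "epochs": 3}` and the datasets `["cc12m", "cc3m"]`.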

robvanvolt avatar Aug 30 '21 10:08 robvanvolt