
Pretrained models

robvanvolt opened this issue 3 years ago · 5 comments

As far as I understand, image generation currently takes a good amount of time because the model first has to be trained.

When the pretrained models are released, how big will they be, how long will image generation take, and how much computing power will it need?

To my understanding, training the model usually takes a long time, but with pretrained models the results would be there more or less "in an instant" - or is there more to it?

Best regards

robvanvolt avatar Jan 23 '21 11:01 robvanvolt

@robvanvolt Hello! It is actually not too bad! The model size is reportedly 13B, so actually equivalent to some widely used language models out there. It won't be instantaneous, but it can be made faster through various tricks. It will be vastly less expensive than training, of course.

lucidrains avatar Jan 23 '21 17:01 lucidrains
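For anyone curious what sampling from a released checkpoint might look like in practice, here is a minimal sketch using this repository's DiscreteVAE and DALLE classes. The hyperparameters, file name, and checkpoint layout below are placeholders, not the released model's actual settings, and must match however the pretrained model was really trained and saved:

```python
import torch
from dalle_pytorch import DiscreteVAE, DALLE

# Use a CUDA GPU if one is available; sampling also runs on CPU, just much slower.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Placeholder hyperparameters -- they must match the pretrained checkpoint.
vae = DiscreteVAE(
    image_size=256,
    num_layers=3,
    num_tokens=8192,
    codebook_dim=512,
)

dalle = DALLE(
    dim=512,
    vae=vae,               # the VAE the DALLE was trained with
    num_text_tokens=10000,
    text_seq_len=128,
    depth=16,
    heads=16,
).to(device)

# Placeholder file name; depending on how the checkpoint was saved you may need to
# unpack a dict (e.g. a 'weights' entry) instead of loading the state dict directly.
dalle.load_state_dict(torch.load('dalle.pt', map_location=device))

# A batch of already-tokenized captions (random token ids here, just to show shapes).
text = torch.randint(0, 10000, (1, 128), device=device)

with torch.no_grad():
    images = dalle.generate_images(text, filter_thres=0.9)

print(images.shape)  # expected: (1, 3, 256, 256)
```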

@lucidrains thank you for your quick response!

Will it then be possible to generate images from the pretrained models without a CUDA GPU, e.g. using only an integrated Intel GPU?

And "The model size is reportedly 13B" refers to roughly how many gigabytes of storage space?

robvanvolt avatar Jan 23 '21 19:01 robvanvolt

@robvanvolt Probably the number of parameters.

batrlatom avatar Jan 24 '21 20:01 batrlatom

Model size refers to the number of parameters in the model. Roughly speaking, more parameters means more model capacity, which (given enough training data) usually means a more accurate model.

DALL-E operates with about 13 billion parameters, which is actually small compared to GPT-3's 175 billion. Just imagine how good DALL-E 2 is gonna be :oooo

powderblock avatar Jan 24 '21 22:01 powderblock
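To connect that to robvanvolt's earlier question about gigabytes: parameter count times bytes per parameter gives a rough lower bound on checkpoint size (the weights alone, ignoring optimizer state and metadata). A quick back-of-the-envelope calculation:

```python
def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Rough on-disk size of the weights alone, ignoring optimizer state and metadata."""
    return num_params * bytes_per_param / 1e9

# 13 billion parameters, stored as 32-bit vs. 16-bit floats
print(checkpoint_size_gb(13e9, 4))  # ~52 GB in fp32
print(checkpoint_size_gb(13e9, 2))  # ~26 GB in fp16
```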

I created a repository which sole purpose is to host / collect pretrained models:

https://github.com/robvanvolt/DALLE-models

Here everyone can make their models available, regardless of whether they were trained on a specific or a general dataset. Note that GitHub's maximum file size prevents uploading the bigger models, so you need to host those yourself (mega.nz has a free 50 GB tier, for example, which is more than enough). It is a waste of energy if 100 people have to train 100 DALL-E models with the same hyperparameters and the same dataset, so hopefully the collection can give more people access to a broader spectrum of training results! :)

I uploaded an example of my own model here:

https://github.com/robvanvolt/DALLE-models/tree/main/models/taming_transformer/16L_64HD_16H_756I_128T_cc12m_1E

robvanvolt avatar May 06 '21 17:05 robvanvolt
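For anyone grabbing one of these externally hosted checkpoints, here is a minimal download-and-inspect sketch. The URL and file name are placeholders (mega.nz links in particular need their own download tooling), and the structure of the loaded object depends on how the model was saved:

```python
import torch
import urllib.request

# Placeholder URL -- substitute the actual link from the DALLE-models repository.
url = 'https://example.com/dalle_checkpoint.pt'
path = 'dalle_checkpoint.pt'

urllib.request.urlretrieve(url, path)

# map_location='cpu' lets you open and inspect the checkpoint without a GPU.
checkpoint = torch.load(path, map_location='cpu')
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))
```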