
Using a pre-trained text-to-text language model in place of the transformer part of the DALL-E model.

Open · Vbansal21 opened this issue 3 years ago · 0 comments

Thank you for making this repository to replicate OpenAI's DALL-E. I was thinking it could be more efficient to use a pre-existing SOTA text-to-text language model for the transformer part of DALL-E, with a very deep feed-forward layer acting as an adapter between the VAE and the transformer. Only that adapter layer would be trained, with the VAE and the transformer both frozen during training. For example, GPT-J, Reformer (its long sequence lengths would help in generating high-resolution images), or DeBERTa could serve as the transformer (although DeBERTa would also require changing the script to support masked language modelling). In theory, this would be far less compute-expensive. What are your thoughts on it?
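To make the idea concrete, here is a minimal sketch of what I have in mind. All names and sizes are hypothetical, and `nn.TransformerEncoder` is just a stand-in for a real frozen LM like GPT-J; the point is only that the deep feed-forward adapter maps frozen VAE codebook embeddings into the LM's embedding space, and it is the only module that receives gradients:

```python
# Hypothetical sketch: frozen VAE codebook + frozen pretrained LM,
# with only a trainable MLP adapter in between.
import torch
import torch.nn as nn

class VAEToLMAdapter(nn.Module):
    """Deep feed-forward adapter: VAE codebook dim -> LM embedding dim."""
    def __init__(self, vae_dim: int, lm_dim: int, hidden_dim: int = 2048, depth: int = 4):
        super().__init__()
        dims = [vae_dim] + [hidden_dim] * (depth - 1) + [lm_dim]
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.GELU()]
        self.net = nn.Sequential(*layers[:-1])  # drop the final activation

    def forward(self, x):
        return self.net(x)

# Stand-ins for the frozen components (sizes are illustrative).
vae_codebook = nn.Embedding(8192, 256)      # frozen dVAE codebook
lm = nn.TransformerEncoder(                 # placeholder for e.g. GPT-J
    nn.TransformerEncoderLayer(d_model=1024, nhead=16, batch_first=True),
    num_layers=2,
)
for p in vae_codebook.parameters():
    p.requires_grad = False
for p in lm.parameters():
    p.requires_grad = False

adapter = VAEToLMAdapter(vae_dim=256, lm_dim=1024)
optimizer = torch.optim.Adam(adapter.parameters(), lr=3e-4)  # adapter only

# One toy training step: VAE image tokens -> adapter -> frozen LM.
image_tokens = torch.randint(0, 8192, (2, 64))   # batch of VAE token ids
embeds = adapter(vae_codebook(image_tokens))     # (2, 64, 1024)
out = lm(embeds)
loss = out.pow(2).mean()                         # dummy loss for illustration
loss.backward()
optimizer.step()
```

Since only the adapter's parameters are in the optimizer, the compute and memory cost per step should be much lower than training the full transformer, which is the main appeal of the setup.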

Vbansal21 · Sep 14 '21 06:09