DALLE-pytorch
Using pre-trained text-to-text language model in place of transformer part of the DALLE model.
Thank you for making this repository to replicate OpenAI's DALL-E. I was thinking it would be more efficient if we used a pre-existing SOTA text-to-text language model for the transformer part of DALL-E, added a very deep feed-forward layer as an adapter between the VAE and the transformer, and then trained only that adapter layer, keeping the VAE and the transformer frozen during training. For example, the transformer could be GPT-J, Reformer (its long sequence lengths would help when generating high-resolution images), or DeBERTa (although DeBERTa would also require changing the script to support masked language modelling). In theory this would be far less compute-intensive. A rough sketch of the adapter idea is below. What are your thoughts about it?
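
To make the proposal concrete, here is a minimal sketch (not the repo's API) of what I have in mind: a trainable deep feed-forward adapter that maps frozen VAE image-token embeddings into the hidden space of a frozen pre-trained language model, with only the adapter receiving gradients. The dimensions, the `Adapter` module, and the `freeze` helper are all hypothetical placeholders, not anything from DALLE-pytorch.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Deep feed-forward bridge from the VAE codebook space to the LM hidden size."""
    def __init__(self, vae_dim, lm_dim, hidden_dim=2048, depth=4):
        super().__init__()
        layers, d_in = [], vae_dim
        for _ in range(depth - 1):
            layers += [nn.Linear(d_in, hidden_dim), nn.GELU()]
            d_in = hidden_dim
        layers.append(nn.Linear(d_in, lm_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def freeze(module):
    # Disable gradients so the pre-trained VAE / LM stay fixed during training.
    for p in module.parameters():
        p.requires_grad = False
    return module.eval()

# Hypothetical dimensions: e.g. a VAE codebook embedding size and the hidden
# size of a large pre-trained LM (GPT-J uses 4096).
vae_dim, lm_dim = 512, 4096
adapter = Adapter(vae_dim, lm_dim)

# Only the adapter's parameters go to the optimizer; the frozen VAE and LM
# (not shown here) would just be wrapped with `freeze(...)`.
optimizer = torch.optim.Adam(adapter.parameters(), lr=3e-4)
```

The point is that the optimizer only ever sees the adapter's parameters, so the memory and compute cost of training is limited to that relatively small feed-forward stack plus forward passes through the frozen models.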