
Why not use a text encoder like GPT?

Open eeyrw opened this issue 2 years ago • 0 comments

According to Google's Imagen paper, increasing text encoder capacity helps generation performance a lot; they use T5-XXL as the text encoder. T5-XXL is too big to run on a personal computer, but GPT Neo 1.3B/2.7B is trained on an 800GB corpus and is not too big. I think it should improve the model's understanding of natural language compared with CLIP.

eeyrw · Jan 13 '23 08:01