Pablo Pernias
Guess the official repo is out: https://github.com/google-research/maskgit (although it seems to be in JAX). Let's find out (and please share here 🙏) what was missing in our implementations 🙇
Not sure how accurate these results are, but when I plot the memory usage with respect to the sequence length of a model with this setup `dim = 128, depth...
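For reference, this is the kind of measurement I mean; a minimal sketch (not my original benchmark) of recording peak GPU memory against sequence length. The `peak_memory_mb` helper, the use of `nn.TransformerEncoder`, and the specific sequence lengths are assumptions for illustration:

```python
# Sketch: peak GPU memory vs. sequence length for a small transformer.
# Requires a CUDA device; numbers will vary with the exact architecture.
import torch
import torch.nn as nn

def peak_memory_mb(model, seq_len, dim, device="cuda"):
    torch.cuda.reset_peak_memory_stats(device)
    x = torch.randn(1, seq_len, dim, device=device)
    with torch.no_grad():
        model(x)
    return torch.cuda.max_memory_allocated(device) / 1024 ** 2

device = "cuda"
dim = 128  # matches the `dim = 128` setup mentioned above; depth is assumed
layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=6).to(device).eval()

for seq_len in (256, 512, 1024, 2048):
    print(seq_len, f"{peak_memory_mb(model, seq_len, dim, device):.1f} MB")
```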
That would be awesome. What I have tried to speed up inference in my custom implementations of autoregressive self-attention is caching the output of the self-attention at timestep T...
What do you mean by 'fine-tune'? Training a vanilla transformer, then replacing the attention layers with performer attention layers and doing some more training?
I will try that, thanks! Any idea what the expected speedup would be?
Awesome, thanks!
Well, I did mine in a custom Transformer implementation. What I basically did is the following: the attention layers generally receive Q, K & V and output a sequence; I...
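Roughly the idea is something like this minimal sketch, assuming a causal decoder where the outputs for positions < T don't change as new tokens are appended, so only the newest token's attention needs to be computed at each step; the module and method names (`CachedSelfAttention`, `step`) are illustrative, not my exact code:

```python
# Sketch of caching K/V from earlier timesteps during autoregressive decoding,
# so each step only attends from the newest token over the cached sequence.
import torch
import torch.nn as nn

class CachedSelfAttention(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.to_out = nn.Linear(dim, dim)
        self.k_cache = None  # (B, H, T, head_dim)
        self.v_cache = None

    def step(self, x_new):  # x_new: (B, 1, dim), the newest token only
        b, n, d = x_new.shape
        q, k, v = self.to_qkv(x_new).chunk(3, dim=-1)
        split = lambda t: t.view(b, n, self.heads, -1).transpose(1, 2)
        q, k, v = map(split, (q, k, v))
        if self.k_cache is not None:
            # prepend the cached keys/values from previous timesteps
            k = torch.cat([self.k_cache, k], dim=2)
            v = torch.cat([self.v_cache, v], dim=2)
        self.k_cache, self.v_cache = k, v
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.to_out(out)

# usage: feed tokens one at a time while decoding
attn = CachedSelfAttention(dim=128)
for t in range(16):
    y = attn.step(torch.randn(1, 1, 128))
```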
The diffusers branch is currently broken and will be fixed very soon; meanwhile, you can install it from an older commit:
```
pip3 install git+https://github.com/kashif/diffusers.git@a3dc21385b7386beb3dab3a9845962ede6765887
```
In order to use the small stage C, the "model version" field should be set to `1B` in the config YAML (and to `700M` for stage B): https://github.com/Stability-AI/StableCascade/blob/209a52600f35dfe2a205daef54c0ff4068e86bc7/train/train_c.py#L152 https://github.com/Stability-AI/StableCascade/blob/209a52600f35dfe2a205daef54c0ff4068e86bc7/train/train_b.py#L167
Any update on this? If not, I'll close the issue.