neverix
That is not Stable Diffusion; it's an older model that has been available since April at https://github.com/CompVis/latent-diffusion
Yes this specific checkpoint is causing a lot of confusion
Have you gotten it to work?
Nice, this will be useful for porting audio and pose models
Right now the code can't just run a single forward pass over all tokens because of the caching implementation. It has to step through the tokens one at a time instead of just masking the attention.
#80 solves this
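For anyone unsure what the difference is between "one forward pass with masked attention" and "running through every token with a cache", here is a minimal sketch in plain Python. It is a toy single-head attention, not the code from this repo; the function names (`masked_forward`, `cached_forward`) and the toy inputs are made up for illustration. Both paths should produce the same outputs, which is why a masked pass can stand in for the per-token loop when the whole sequence is already known.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax; exp(-inf) evaluates to 0.0, so masked slots vanish.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, values):
    # Mix the value vectors according to the attention weights.
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def masked_forward(qs, ks, vs):
    """Every position is computed from the full sequence; a causal mask hides the future."""
    n = len(qs)
    scale = 1.0 / math.sqrt(len(qs[0]))
    outputs = []
    for i in range(n):
        scores = [
            dot(qs[i], ks[j]) * scale if j <= i else float("-inf")
            for j in range(n)
        ]
        outputs.append(weighted_sum(softmax(scores), vs))
    return outputs


def cached_forward(qs, ks, vs):
    """Tokens are fed one at a time; keys and values accumulate in a cache."""
    scale = 1.0 / math.sqrt(len(qs[0]))
    key_cache, value_cache, outputs = [], [], []
    for q, k, v in zip(qs, ks, vs):
        key_cache.append(k)
        value_cache.append(v)
        scores = [dot(q, cached_k) * scale for cached_k in key_cache]
        outputs.append(weighted_sum(softmax(scores), value_cache))
    return outputs


def random_vectors(count, dim, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


# Toy inputs: 5 tokens, 4-dimensional queries, keys, and values.
rng = random.Random(0)
qs, ks, vs = (random_vectors(5, 4, rng) for _ in range(3))

full = masked_forward(qs, ks, vs)
stepped = cached_forward(qs, ks, vs)
max_diff = max(abs(a - b) for ra, rb in zip(full, stepped) for a, b in zip(ra, rb))
print("largest difference between the two paths:", max_diff)
```

The cached loop is what you want at sampling time, when tokens arrive one by one. The masked pass is what you want when scoring or training on a sequence you already have.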
Look at [the scripts](https://github.com/CompVis/taming-transformers/blob/master/scripts/reconstruction_usage.ipynb), they're pretty helpful
I think I finally figured it out.

1) `!pip install mesh-transformer-jax/ jax==0.2.12 tensorflow==2.5.0 chex==0.0.6 jaxlib==0.3.7`

2)
```
#@title Patch 1
%%file /usr/local/lib/python3.7/dist-packages/chex/_src/pytypes.py
# Lint as: python3
# Copyright 2020 DeepMind...
```
Ideally there would be a converter, plus an option to ignore mismatched input/output layers in case the model has a different number of channels
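To make the "ignore mismatched inputs/outputs" idea concrete, here is a rough sketch in plain Python. It is not tied to any particular framework: parameters are just nested lists in a dict, and `shape_of` and `load_matching` are hypothetical helper names. The idea is to copy every checkpoint entry whose name and shape agree with the target model, and to leave the rest (typically the first and last layers when the channel count differs) at their fresh initialization.

```python
def shape_of(array):
    """Shape of a rectangular nested list, e.g. [[0, 0], [0, 0]] gives (2, 2)."""
    shape = []
    while isinstance(array, list):
        shape.append(len(array))
        array = array[0] if array else None
    return tuple(shape)


def load_matching(target, checkpoint):
    """Copy checkpoint entries into target where both name and shape agree.

    Returns the merged parameters and the names that were skipped.
    """
    merged, skipped = dict(target), []
    for name, value in checkpoint.items():
        if name in target and shape_of(target[name]) == shape_of(value):
            merged[name] = value
        else:
            skipped.append(name)
    return merged, skipped


# Toy example: the new model takes 4 input channels, the checkpoint was trained with 3.
target = {
    "conv_in": [[0.0] * 4 for _ in range(2)],  # shape (2, 4), freshly initialized
    "mid": [[0.0] * 2 for _ in range(2)],      # shape (2, 2)
}
checkpoint = {
    "conv_in": [[1.0] * 3 for _ in range(2)],  # shape (2, 3), mismatched
    "mid": [[1.0] * 2 for _ in range(2)],      # shape (2, 2), matches
    "head": [1.0, 1.0],                        # not present in the target model
}

merged, skipped = load_matching(target, checkpoint)
print("skipped:", skipped)
```

Reporting the skipped names matters: silently dropping layers makes it easy to miss that a model was only partially loaded.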
I made a similar fix, can confirm that this works