Results 32 comments of Yuzhong Zhao

Your results on CUB seem very low. A possible reason is that the model overfitted in training stage 2. What's the performance of the model after training...

**Reply 1.** Based on your result, it seems that you have successfully reproduced the paper's result, i.e., `98.2` gt-known loc on CUB, ovo. **Reply 2.** In fact, you can also...

1. We find that `fr` is effective on the more difficult ImageNet, but less effective on CUB. 2. This JSON file is the classification result of a (fine-tuned) classification network in...
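
The thread does not show the file's exact name or schema, but a prediction file of this kind is usually just a mapping from image id to the class index predicted by the fine-tuned classifier. A minimal sketch, assuming a hypothetical `cls_results.json` with that layout:

```python
import json

# Hypothetical file name and schema, for illustration only: the real file in the
# repo may use different keys or store logits instead of class indices.
with open("cls_results.json") as f:
    cls_results = json.load(f)  # e.g. {"Black_Footed_Albatross_0001.jpg": 6, ...}

# The predicted class index could then be looked up per image, e.g. when
# evaluating top-1 localization.
pred_class = cls_results["Black_Footed_Albatross_0001.jpg"]
```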

The most likely reason is that the `register_attention_control` function at line 67 of `attn.py` is not working properly. At line 115 of `attn.py`, we replace the `get_attention_scores` method for all...
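
For reference, here is a minimal sketch (not the repo's actual code) of how a `register_attention_control` function of this kind typically patches `get_attention_scores` on every attention module of a diffusers UNet. The `controller` and `place_in_unet` names follow the call described further down; everything else is an assumption:

```python
def register_attention_control(unet, controller):
    """Sketch: replace get_attention_scores on every attention module so the
    controller sees (and may modify) the attention maps."""

    def make_hook(module, place_in_unet):
        original = module.get_attention_scores  # bound method of this module

        def get_attention_scores(query, key, attention_mask=None):
            attention_probs = original(query, key, attention_mask)
            # Rough heuristic: cross-attention has a different key length (text tokens).
            is_cross = key.shape[1] != query.shape[1]
            return controller(attention_probs, is_cross, place_in_unet)

        return get_attention_scores

    patched = 0
    for name, module in unet.named_modules():
        if hasattr(module, "get_attention_scores"):
            place = "down" if "down_blocks" in name else "up" if "up_blocks" in name else "mid"
            module.get_attention_scores = make_hook(module, place)
            patched += 1
    print(f"patched {patched} attention modules")  # 0 means the hook never attached
```

If that printed count is 0, the installed diffusers version probably does not expose `get_attention_scores`, which matches the symptom above.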

1. Try `pip install --upgrade diffusers[torch]==0.13.1`, which is the version we use. 2. Check whether the code runs through the `get_attention_scores` method at line 71 of `attn.py`. This method adds attention maps...
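
A quick, illustrative way to check both points (the `CrossAttention` import path is my assumption for diffusers 0.13.x; adjust it if your install differs):

```python
import diffusers
from diffusers.models.cross_attention import CrossAttention  # assumed path for 0.13.x

print(diffusers.__version__)                            # expect "0.13.1"
print(hasattr(CrossAttention, "get_attention_scores"))  # expect True
```

If the import fails or the second line prints `False`, the installed diffusers version does not provide the method the hook relies on, so the attention maps will never be collected.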

At line 106 in `attn.py`, `attention_probs = controller(attention_probs, is_cross, place_in_unet)` adds the `attention_probs` to `self.step_store` in the controller; you can check whether the code goes through this line.
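
To make the data flow concrete, here is a minimal controller sketch consistent with that call signature (the class in the repo is more elaborate; the store keys are illustrative):

```python
class AttentionStore:
    """Minimal pass-through controller: records attention maps per forward pass."""

    def __init__(self):
        self.step_store = self._empty_store()

    @staticmethod
    def _empty_store():
        return {"down_cross": [], "mid_cross": [], "up_cross": [],
                "down_self": [], "mid_self": [], "up_self": []}

    def __call__(self, attention_probs, is_cross, place_in_unet):
        key = f"{place_in_unet}_{'cross' if is_cross else 'self'}"
        self.step_store[key].append(attention_probs)  # maps are collected here
        return attention_probs                        # returned unchanged
```

If `step_store` is still empty after a forward pass, line 106 was never reached and the attention hook is not active.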

Since CLIP (the text encoder) is frozen the whole time, it seems that there is a problem with the representative embeddings trained in stage 1. Does the model you trained in stage...

1. `fr` relies only on the `train_token`. 2. VQGAN is an improved version of a VAE, and the two are similar in structure.

Instead of choosing the model that performs best on the validation/test set, we choose the model saved at the last epoch. If the performance...

The code can run on an RTX 3090 with 24 GB of memory; it seems that your machine is running out of memory (only 7.06 GB already allocated). You can free up...
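
For what it's worth, a few standard PyTorch calls to inspect and release GPU memory before rerunning (other processes occupying the card have to be stopped separately, e.g. via `nvidia-smi`):

```python
import torch

print(f"{torch.cuda.memory_allocated() / 1024**3:.2f} GiB allocated by this process")
print(f"{torch.cuda.memory_reserved() / 1024**3:.2f} GiB reserved by the caching allocator")

torch.cuda.empty_cache()  # return cached, unused blocks to the driver
```

If memory is still tight after that, reducing the batch size or image resolution is the usual next step.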