ELITE
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (ICCV 2023, Oral)
I use four 4090 GPUs, each with 24 GB of memory, but I get an "out of memory" error. Could you tell me what hardware you used?
I ran app.py and uploaded an image, but it took nearly an hour for the image to load and no result was generated. Could you tell me why? Is it because the...
Add IP-Adapter to ELITE that is based on Stable Diffusion v1-5.
Hi, thanks for the codebase. I was wondering if it's possible to modify the code so that it inverts images without foreground masks, like the original textual inversion paper. Otherwise,...
Thanks for your great work! I'm curious about the process behind the visualization of learned word embeddings. Is it using the layers selected from the CLIP image encoder and then input...
DINO-I
Hello, do you have reference code for the DINO-I metric? I look forward to your reply.
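For anyone looking for a starting point on this question: DINO-I is commonly described as the average pairwise cosine similarity between DINO ViT-S/16 embeddings of generated images and real images of the subject. The sketch below is not the authors' evaluation code. It assumes the embeddings have already been extracted by some DINO model and are passed in as plain lists of floats; the function names `cosine_similarity` and `dino_i_score` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def dino_i_score(
    generated: Sequence[Sequence[float]],
    real: Sequence[Sequence[float]],
) -> float:
    """Mean cosine similarity over all (generated, real) embedding pairs.

    Assumption: each inner sequence is an image embedding that was
    extracted beforehand (e.g. by a DINO ViT-S/16 model).
    """
    if not generated or not real:
        raise ValueError("both embedding lists must be non-empty")
    total = sum(cosine_similarity(g, r) for g in generated for r in real)
    return total / (len(generated) * len(real))
```

In practice the embeddings would come from a pretrained DINO checkpoint, and the score would be averaged over all subjects and prompts in the evaluation set; those steps are left out here because the thread does not specify them.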
Thank you for your great work, but I'm confused about Table 1 (ablation study) in the paper. I couldn't figure out whether the method "Multi-Layers Multi-Words [v]" (highest CLIP-I and...
Hello, may I ask how you decided to select features from the five layers {24, 4, 8, 12, 16} rather than other layers?
Hi Author, Thank you for sharing this wonderful work! I’m interested in running the code and validating the results. However, I couldn’t find the test data or the code used...