ELLA
Training Details
First of all, thanks for the remarkable work! I have a few questions about the training details:
- Batch size and steps of training SD1.5 on 512 res
- Batch size and steps of training SDXL on 512 & 1024 res
- Is the 34M training data size necessary? Have you tried training on 0.1M, 1M, or 10M samples?
- How is the 100K 1024 res data collected?
- What's the prompt used for data recaption?
Thanks
I am training SDXL ELLA so I can answer a few.
Batch size and steps of training SDXL on 512 & 1024 res
Ideally, as large a batch size as you can get.
Is the 34M training data size necessary? Have you tried training on 0.1M, 1M, or 10M samples?
34M may actually be too small. I have tried training on smaller amounts, and if a concept doesn't exist in your training data, the adapter may struggle to reproduce it; see: https://github.com/TencentQQGYLab/ELLA/issues/35
What's the prompt used for data recaption?
You can use any major MLLM, such as LLaVA 1.6, with the prompt "Describe this image in detail".
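A minimal recaptioning sketch along those lines, using the Hugging Face `transformers` LLaVA-NeXT (LLaVA 1.6) wrappers. The model id, chat template, and helper names here are assumptions for illustration, not something specified in this thread:

```python
# Sketch: dense recaptioning of training images with LLaVA 1.6.
# Assumptions (not from the thread): the llava-hf checkpoint name and the
# Mistral-style "[INST] ... [/INST]" chat template it expects.

RECAPTION_INSTRUCTION = "Describe this image in detail."


def build_recaption_prompt(instruction: str = RECAPTION_INSTRUCTION) -> str:
    """Chat-template prompt assumed for llava-v1.6-mistral-7b-hf."""
    return f"[INST] <image>\n{instruction} [/INST]"


def recaption_image(image_path: str, max_new_tokens: int = 256) -> str:
    """Generate one dense caption (requires torch, transformers, Pillow)."""
    import torch
    from PIL import Image
    from transformers import (
        LlavaNextForConditionalGeneration,
        LlavaNextProcessor,
    )

    model_id = "llava-hf/llava-v1.6-mistral-7b-hf"  # assumed checkpoint
    processor = LlavaNextProcessor.from_pretrained(model_id)
    model = LlavaNextForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        images=image, text=build_recaption_prompt(), return_tensors="pt"
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    text = processor.decode(out[0], skip_special_tokens=True)
    # The decoded string echoes the prompt; keep only the answer part.
    return text.split("[/INST]")[-1].strip()
```

Run `recaption_image` over your dataset and store the captions alongside the images; these dense captions are then what the adapter is trained against instead of the original alt-text.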
@AmericanPresidentJimmyCarter Hello, I would appreciate it if you could share your training script and code, thanks a lot!
Looks like training scripts are here: https://github.com/DataCTE/ELLA_Training and https://github.com/TencentQQGYLab/ELLA/pull/27
@AmericanPresidentJimmyCarter Hello, I would appreciate it if you could share your training script and code, thanks a lot!
If it ever works well I will release the weights. I don't have anything like the results from the paper so far.