Make-A-Story
How much VRAM is needed for this?
This looks great!
Could you share some information on the setup you used to train the transformer model?
- how many GPUs / for how long
- how many steps
- what batch size
It would be helpful to have this information to better understand the cost of training these models.
- We used a single GPU (an A600).
- MUGEN is a large dataset, so it took longer to train; we trained for 3 epochs on it. For the other two datasets we trained for 100+ epochs.
- For MUGEN, Flintstones, and Pororo we used batch sizes of 24, 12, and 16 respectively.
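For anyone reproducing this, the per-dataset hyperparameters from the replies above could be summarized in a small config sketch. Only the epoch counts and batch sizes come from the answers; the dictionary structure and key names are hypothetical, not taken from the repo:

```python
# Hypothetical summary of the training setup described above.
# Epoch counts and batch sizes are quoted from the replies;
# "100+ epochs" is recorded here as 100. Everything else is illustrative.
TRAIN_CONFIG = {
    "mugen": {"epochs": 3, "batch_size": 24},        # large dataset, fewer epochs
    "flintstones": {"epochs": 100, "batch_size": 12},  # "100+ epochs"
    "pororo": {"epochs": 100, "batch_size": 16},       # "100+ epochs"
}

for name, cfg in TRAIN_CONFIG.items():
    print(f"{name}: {cfg['epochs']} epochs, batch size {cfg['batch_size']}")
```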