Zineng Tang

Results: 40 comments by Zineng Tang

I am assuming you are using the last checkpoint the run generated instead of an intermediate checkpoint? If so, try training for more epochs. If it still doesn't work, I will provide...

What is your transformers version? Did you install requirements.txt?

I think so. The transformers version pinned in requirements.txt is 4.26.0; a higher version can cause mismatches with the code.
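As a quick way to verify the pin before filing an issue, a small check like this can compare the installed transformers against the 4.26.0 pin (a sketch; `check_transformers` is a hypothetical helper, not part of the repo):

```python
from importlib.metadata import version, PackageNotFoundError

PINNED = "4.26.0"  # version pinned in requirements.txt

def version_tuple(v: str) -> tuple:
    """Turn a version string like '4.26.0' into (4, 26, 0) for comparison."""
    return tuple(int(part) for part in v.split(".")[:3])

def check_transformers(pinned: str = PINNED) -> str:
    """Report whether the installed transformers version matches the pin."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return "transformers not installed; run: pip install -r requirements.txt"
    if version_tuple(installed) > version_tuple(pinned):
        return (f"transformers {installed} is newer than the pinned {pinned}; "
                f"downgrade with: pip install transformers=={pinned}")
    return f"transformers {installed} is OK"

print(check_transformers())
```

Downgrading to the pinned version usually resolves the mismatch errors.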

The MAE checkpoint is included in the released checkpoint together with the transformer weights. If you want the original MAE weights, you can download them from the original MAE codebase.

Is your GPU memory below 30 GB? If not, 32 GB or 40 GB of memory is enough.

Let me try to make a PR to incorporate fp16; it should then need less than 24 GB of memory.

You can try DeepSpeed stage 3 for model parameter state partitioning.

When using DeepSpeed stage 3, the model parameter states are partitioned across GPUs, which reduces per-GPU model memory. DeepSpeed is supported in PyTorch, PyTorch Lightning, etc.
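A minimal ZeRO stage-3 config could look like the following (a sketch using standard DeepSpeed config keys; the batch size, CPU offload, and fp16 settings are assumptions to tune for your setup, and the file name `ds_config.json` is just a convention):

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_param": { "device": "cpu" },
    "offload_optimizer": { "device": "cpu" }
  }
}
```

With CPU offload enabled, parameter and optimizer states spill to host RAM, trading speed for a lower GPU memory footprint.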

Can you give me the text prompt, the generated audio, and the random seed? Then I can check whether it matches the expected output.