Real3DPortrait How to fine-tune the pre-trained model on my dataset?

Hello, I would like to ask how I can load the pre-trained model and fine-tune it on my self-collected dataset.

Jul 09 '24 02:07 RayShing

Hi, you can use the init_from_ckpt option.

Jul 09 '24 05:07 yerfor

Thanks!

I have another question regarding the pre-trained models you provided. Specifically, you included "audio2secc_vae" and "secc2plane_torso_orig". However, in your training guidelines for audio, it is recommended to first train "audio_lm3d_syncnet" and then "audio2motion". Similarly, for motion, the guideline suggests first training "Img-to-Plane" followed by "Motion-to-Video", which includes "secc2plane_head" and "secc2plane_torso".

I am a bit confused about their relationships. Is "audio2secc_vae" equivalent to "audio2motion", and is "secc2plane_torso_orig" equivalent to "secc2plane_torso"?

For audio training, should I:

1. Train "audio_lm3d_syncnet" myself, and then
2. When training "audio2motion", provide the checkpoints from both my trained "audio_lm3d_syncnet" and the provided "audio2secc_vae"?

Or, do I not have to train "audio_lm3d_syncnet" at all and just provide "audio2secc_vae" for fine-tuning?

Similarly, for Motion-to-Video training, should I:

1. Train "Img-to-Plane" myself
2. Train "secc2plane_head" myself, based on the trained "Img-to-Plane"
3. When training "secc2plane_torso", provide the checkpoints from both my trained "secc2plane_head" and the provided "secc2plane_torso_orig"?

But it seems we can only set one checkpoint for "init_from_ckpt"?

Additionally, does "secc2plane_head" imply inferring only the head area without the torso?

Thank you so much for your help!

Jul 09 '24 06:07 RayShing

Yes, "audio2secc_vae" is equivalent to "audio2motion", and "secc2plane_torso_orig" is equivalent to "secc2plane_torso".
For audio training, should I ==> Yes, you need to train a syncnet yourself.
You can skip the image-to-plane pre-training, and go through the init_from_ckpt => secc2plane_head => secc2plane_torso.
does "secc2plane_head" imply inferring only the head area without the torso? ==> Yes

Jul 09 '24 07:07 yerfor

Thank you so much for your response! I am still a bit confused about this step:

You can skip the image-to-plane pre-training, and go through the init_from_ckpt => secc2plane_head => secc2plane_torso.

Where can we get the pre-trained model for image-to-plane? It appears that currently, we only have the pre-trained models for "audio2motion" and "secc2plane_torso".

Additionally, I noticed that during evaluation, the human figure changes each time instead of using the one I provided. Where is this part of the setup, and how can we modify it to use my provided human figure?

Thank you for your time!

Jul 09 '24 14:07 RayShing

You can use the provided pre-trained secc2plane_torso checkpoint to initialize your own secc2plane_head model; just set strict=False.
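As a rough illustration of what strict=False does when a checkpoint holds more weights than the target model, here is a minimal PyTorch sketch. The two classes and their layer names are hypothetical stand-ins, not the actual secc2plane_torso or secc2plane_head modules, and the in-memory state dict stands in for a checkpoint file loaded with torch.load.

```python
import torch
import torch.nn as nn


class TorsoModel(nn.Module):
    """Hypothetical stand-in for a torso model that also contains head layers."""

    def __init__(self):
        super().__init__()
        self.head_backbone = nn.Linear(8, 8)
        self.torso_branch = nn.Linear(8, 4)


class HeadModel(nn.Module):
    """Hypothetical stand-in for a head-only model sharing some layer names."""

    def __init__(self):
        super().__init__()
        self.head_backbone = nn.Linear(8, 8)


# Assumption: in practice the state dict would come from a checkpoint file,
# e.g. torch.load(path, map_location="cpu"), possibly nested under a key.
pretrained = TorsoModel()
state_dict = pretrained.state_dict()

head = HeadModel()
# strict=False copies every key that matches by name and skips the rest
# instead of raising on missing or unexpected keys.
result = head.load_state_dict(state_dict, strict=False)

print("missing keys:", result.missing_keys)        # parameters left at their initial values
print("unexpected keys:", result.unexpected_keys)  # checkpoint entries that were ignored
```

Checking the two printed lists is a quick way to confirm which layers were actually initialized. Note that strict=False only tolerates missing or extra keys; a key that matches by name but differs in shape still raises an error.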

To use your own human figure, please modify the code in validation_steps.

Jul 09 '24 14:07 yerfor

Thank you for your reply!

I have modified the training logic. However, when I tried to train the secc2plane_head model on my 4090 GPU, I ran into an out-of-memory (OOM) error. Is there any way to reduce the GPU memory requirement during training? I tried reducing "num_workers", but it did not help.

Jul 09 '24 16:07 RayShing

You can reduce the batch_size, or you can try setting amp=True.
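For context, an amp flag in PyTorch training code usually means automatic mixed precision. The sketch below shows a generic mixed-precision training step with a small batch in plain PyTorch; it is an assumption about what such a flag typically enables, not the repository's actual training loop, and the toy model, tensor shapes, and variable names are all hypothetical.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # this sketch only enables mixed precision on a GPU

model = nn.Linear(16, 1).to(device)  # hypothetical toy model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# GradScaler rescales the loss so float16 gradients do not underflow.
# With enabled=False it is a transparent pass-through.
# (Newer PyTorch releases also expose this as torch.amp.GradScaler.)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

batch_size = 2  # a smaller batch lowers activation memory
x = torch.randn(batch_size, 16, device=device)
y = torch.randn(batch_size, 1, device=device)

optimizer.zero_grad(set_to_none=True)
amp_dtype = torch.float16 if use_amp else torch.bfloat16
# Inside autocast, eligible ops run in reduced precision, which cuts memory use.
with torch.autocast(device_type=device, dtype=amp_dtype, enabled=use_amp):
    loss = nn.functional.mse_loss(model(x), y)

scaler.scale(loss).backward()  # backward pass on the (possibly scaled) loss
scaler.step(optimizer)         # unscales gradients, then calls optimizer.step()
scaler.update()                # adjusts the scale factor for the next step

print("loss:", loss.item())
```

Reducing num_workers mainly affects host RAM used by data loading, whereas GPU memory during training is driven by the model, the activations, and therefore the batch size and numeric precision.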

Jul 09 '24 18:07 yerfor

@yerfor Hi, thank you so much for your wonderful work. I was wondering if you could also release a publicly available syncnet model, so that we can fine-tune on our own datasets more easily?

Aug 19 '24 08:08 moliq1

@felixshing Hello, I would like to inquire about your experience. Did you achieve the desired results? I have about 10 minutes of training data for each character on my end. Is that enough?

Feb 06 '25 02:02 jupinter
