ak01user comments

Results: 18 comments of ak01user

Possibility of reusing some of the facial motion and amplifying movement of selective areas.

mark

Could I input a custom resolution?

You could input [512,768]. The resolution should be divisible by 32, I think.
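A minimal sketch of that constraint, assuming the "divisible by 32" guess in the comment is correct. The helper names `is_valid_resolution` and `snap_resolution` are hypothetical and not part of the project's code; they only check a resolution and round an arbitrary one to the nearest multiple.

```python
def is_valid_resolution(size, multiple=32):
    """Return True if every side is positive and divisible by `multiple`."""
    return all(side > 0 and side % multiple == 0 for side in size)


def snap_resolution(size, multiple=32):
    """Round each side to the nearest multiple (ties round up), never below one multiple."""
    return [max(multiple, (side + multiple // 2) // multiple * multiple) for side in size]


if __name__ == "__main__":
    print(is_valid_resolution([512, 768]))
    print(snap_resolution([500, 750]))
```

If the model actually needs a different multiple, only the `multiple` argument changes.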

add noise scheduler

Hi, the model_kwargs look like the model_kwargs used for inference. Can you tell me what x0 is? Is it the VAE-encoded features of gt_frame, e.g. with shape 1,sq,4,96,64?

It does not matter, you are so kind. There is good news: I did fine-tune on my own dataset, one person's dancing video with 340 frames. I am training just the ['local_image_embedding','local_image_embedding_after'] two blocks'...

add noise scheduler

Hi, I tried to fine-tune on a specific person's data, but the result was not very good. How should I modify it? I notice that the loss_type is 'mse' and the var_type is 'fixed_small', is...

add noise scheduler

Hi, after testing, the model does not seem to perform well for side faces and turns. I think it is because the model cannot obtain the information of the clothes on...

Problem about long video generation with first frame condition

Hi, can you show your results? Maybe it is a preprocessing issue.

Question about `randomref`

Sorry, I notice that in your code both the random_ref and local_image modes make the selected reference image dance. I did not find much difference. @wangxiang1230

Question about `randomref`

> > Sorry, I notice that in your code both the random_ref and local_image modes make the selected reference image dance. I did not find much difference. @wangxiang1230 > > Hi,...

Is there a plan to open source the training code?

Sorry, what does v-prediction mean? Could you tell us more about training/fine-tuning?
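For context, "v-prediction" commonly refers to the velocity parameterization used in diffusion models, where the network is trained to predict v = sqrt(alpha_bar_t) * eps - sqrt(1 - alpha_bar_t) * x0 instead of the noise eps. The sketch below uses plain scalars to show the relation. Whether this project uses exactly this form is an assumption, and the function names are hypothetical.

```python
import math


def q_sample(x0, eps, alpha_bar):
    """Forward diffusion: noisy sample x_t from clean x0 and noise eps."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * eps


def v_target(x0, eps, alpha_bar):
    """Training target under v-prediction (velocity parameterization)."""
    return math.sqrt(alpha_bar) * eps - math.sqrt(1.0 - alpha_bar) * x0


def x0_from_v(x_t, v, alpha_bar):
    """Recover the clean sample from x_t and a predicted v."""
    return math.sqrt(alpha_bar) * x_t - math.sqrt(1.0 - alpha_bar) * v


def eps_from_v(x_t, v, alpha_bar):
    """Recover the noise from x_t and a predicted v."""
    return math.sqrt(1.0 - alpha_bar) * x_t + math.sqrt(alpha_bar) * v
```

With a perfect prediction of v, `x0_from_v` and `eps_from_v` return the original x0 and eps, so the three parameterizations (eps, x0, v) carry the same information and differ only in how the loss is weighted across timesteps.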