
Motion Imitation custom images

Open molo32 opened this issue 4 years ago • 14 comments

How do I do motion imitation with custom images?

molo32 avatar Sep 23 '21 18:09 molo32

Hi, the cross-identity reenactment method will be provided in the next few days. You can then use the code to animate your images. Yurui

RenYurui avatar Sep 24 '21 09:09 RenYurui

Hi @RenYurui, awesome project, congratulations! I have tested the cross-identity reenactment, but the result does not preserve the source identity well, as shown in the following:

https://user-images.githubusercontent.com/67719349/136723007-4e4fa93e-a986-4778-8714-be8e2deb3458.mp4

Can you help me fix the problem?

clumsynope avatar Oct 11 '21 02:10 clumsynope

Hi @clumsynope, can you tell me how to run cross-identity reenactment? Are you using a custom image?

loboere avatar Oct 11 '21 18:10 loboere

Hi @RenYurui, awesome project, congratulations! I have tested the cross-identity reenactment, but the result does not preserve the source identity well, as shown in the following:

out.mp4 Can you help me fix the problem?

I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

kjzju avatar Oct 12 '21 07:10 kjzju

@RenYurui Thank you for sharing your great work. I have tested the cross-identity reenactment, but the result is not very good.

https://user-images.githubusercontent.com/41478810/137077395-fc8f50e0-d2d4-44c0-ba33-0e8803de2a52.mp4

DWCTOD avatar Oct 13 '21 06:10 DWCTOD

Hi @RenYurui, awesome project, congratulations! I have tested the cross-identity reenactment, but the result does not preserve the source identity well, as shown in the following: out.mp4 Can you help me fix the problem?

I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

Hi, would you mind sharing details about how to replace the crop parameter with the source's crop parameter? Thanks.

DWCTOD avatar Oct 13 '21 07:10 DWCTOD

Hi @RenYurui, awesome project, congratulations! I have tested the cross-identity reenactment, but the result does not preserve the source identity well, as shown in the following: out.mp4 Can you help me fix the problem?

I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

Hi, would you mind sharing details about how to replace the crop parameter with the source's crop parameter? Thanks.

In line 33 of vox_video_dataset.py, replace the last 3 dimensions of semantics_numpy with the source's, e.g. semantics_numpy[-3:] = source_semantics_numpy[-3:]. You can obtain source_semantics_numpy the same way you obtain semantics_numpy.
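A minimal NumPy sketch of the replacement described above, with dummy arrays standing in for the real coefficients (the shapes here are illustrative, matching the 113-channel output of transform_semantic over a window of frames):

```python
import numpy as np

# Dummy stand-ins for the real coefficient arrays: 113 coefficient
# channels x 27 frames (shapes are illustrative only).
semantics_numpy = np.ones((113, 27), dtype=np.float32)          # driving coefficients
source_semantics_numpy = np.zeros((113, 27), dtype=np.float32)  # source coefficients

# Overwrite the last 3 coefficient channels of the driving semantics
# with the source's, as suggested in the comment above:
semantics_numpy[-3:] = source_semantics_numpy[-3:]
```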

https://user-images.githubusercontent.com/26479528/137242017-e1a5dcd6-bda1-446f-b0b7-e3f40d72e35a.mp4

kjzju avatar Oct 14 '21 02:10 kjzju

Hi @RenYurui, awesome project, congratulations! I have tested the cross-identity reenactment, but the result does not preserve the source identity well, as shown in the following: out.mp4 Can you help me fix the problem?

I have fixed the problem. Just replace the crop parameter with the source's crop parameter.

Hi, would you mind sharing details about how to replace the crop parameter with the source's crop parameter? Thanks.

In line 33 of vox_video_dataset.py, replace the last 3 dimensions of semantics_numpy with the source's, e.g. semantics_numpy[-3:] = source_semantics_numpy[-3:]. You can obtain source_semantics_numpy the same way you obtain semantics_numpy.

id10283.vaK4t1-WD4M.031553.031737.mp4

Thanks

DWCTOD avatar Oct 14 '21 03:10 DWCTOD

@kjzju

Hi, I modified and tested the code following your hint, but there still seem to be some problems. During cross-id reenactment, the appearance features are also transferred, and the identity information does not seem to be carried over. Below is the code in vox_dataset.py. I am not sure exactly how you modified it:

    def transform_semantic(self, semantic, frame_index):
        index = self.obtain_seq_index(frame_index, semantic.shape[0])
        coeff_3dmm = semantic[index, ...]
        # id_coeff = coeff_3dmm[:, :80]       # identity
        ex_coeff = coeff_3dmm[:, 80:144]      # expression
        # tex_coeff = coeff_3dmm[:, 144:224]  # texture
        angles = coeff_3dmm[:, 224:227]       # euler angles for pose
        # gamma = coeff_3dmm[:, 227:254]      # lighting
        translation = coeff_3dmm[:, 254:257]  # translation
        crop = coeff_3dmm[:, 257:300]         # crop param
        coeff_3dmm = np.concatenate([ex_coeff, angles, translation, crop], 1)
        return torch.Tensor(coeff_3dmm).permute(1, 0)
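For reference, a minimal sketch of the coefficient layout that transform_semantic slices out. The index ranges come from the code above; the single-frame dummy vector is mine:

```python
import numpy as np

# Hypothetical single-frame coefficient vector (300 values per frame),
# used only to illustrate the slicing in transform_semantic above.
coeff_3dmm = np.arange(300, dtype=np.float32).reshape(1, 300)

ex_coeff    = coeff_3dmm[:, 80:144]   # 64 expression coefficients
angles      = coeff_3dmm[:, 224:227]  # 3 Euler angles (pose)
translation = coeff_3dmm[:, 254:257]  # 3 translation values
crop        = coeff_3dmm[:, 257:300]  # 43 crop parameters

# 64 + 3 + 3 + 43 = 113 coefficients per frame after concatenation
semantics = np.concatenate([ex_coeff, angles, translation, crop], axis=1)
```

Note that the identity (0:80), texture (144:224), and lighting (227:254) coefficients are dropped, which is why the concern above about identity information not reaching the generator is plausible.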

DWCTOD avatar Oct 22 '21 03:10 DWCTOD

I just modified the code in vox_video_dataset.py like this:

def load_next_video(self):
    data={}
    self.video_index += 1
    video_item = self.video_items[self.video_index]

    video_item_src = self.video_items[random.randrange(len(self.video_items))]  # randomly select another source video (randrange avoids the out-of-range index that randint's inclusive upper bound can produce)

    with self.env.begin(write=False) as txn:
        # key = format_for_lmdb(video_item['video_name'], 0)
        # img_bytes_1 = txn.get(key)
        # img1 = Image.open(BytesIO(img_bytes_1))
        # data['source_image'] = self.transform(img1)

        # cross-identity
        key = format_for_lmdb(video_item_src['video_name'], 0)
        img_bytes_1 = txn.get(key)
        img1 = Image.open(BytesIO(img_bytes_1))
        data['source_image'] = self.transform(img1) # source image

        semantics_key = format_for_lmdb(video_item_src['video_name'], 'coeff_3dmm')
        semantics_numpy = np.frombuffer(txn.get(semantics_key), dtype=np.float32)
        semantics_numpy = semantics_numpy.reshape((video_item_src['num_frame'], -1))
        source_semantics = self.transform_semantic(semantics_numpy, 0) # source semantic numpy

        semantics_key = format_for_lmdb(video_item['video_name'], 'coeff_3dmm')
        semantics_numpy = np.frombuffer(txn.get(semantics_key), dtype=np.float32)
        semantics_numpy = semantics_numpy.reshape((video_item['num_frame'],-1)) # target semantic numpy

        data['target_image'], data['target_semantics'] = [], []
        for frame_index in range(video_item['num_frame']):
            key = format_for_lmdb(video_item['video_name'], frame_index)
            img_bytes_1 = txn.get(key)
            img1 = Image.open(BytesIO(img_bytes_1))
            data['target_image'].append(self.transform(img1))
            target_semantics = self.transform_semantic(semantics_numpy, frame_index)

            target_semantics[-3:] = source_semantics[-3:] # replace the crop parameters of the target's with the source's

            data['target_semantics'].append(
                # self.transform_semantic(semantics_numpy, frame_index)
                target_semantics
            )
        data['video_name'] = video_item['video_name']
    return data

kjzju avatar Oct 25 '21 03:10 kjzju

@kjzju Thanks, I will try again !

DWCTOD avatar Oct 27 '21 02:10 DWCTOD

@DWCTOD have you figured out the issue with cross-identity motion imitation? I agree with you that the identity information of the source image is not passed to the generator model, and the result does not preserve the source identity very well. @RenYurui, have you tried passing an extra source identity coefficient to the generator model? Hoping for your reply, thanks.

josh-zhu avatar Nov 26 '21 13:11 josh-zhu

@DWCTOD have you figured out the issue with cross-identity motion imitation? I agree with you that the identity information of the source image is not passed to the generator model, and the result does not preserve the source identity very well. @RenYurui, have you tried passing an extra source identity coefficient to the generator model? Hoping for your reply, thanks.

Hi, this problem has not been solved yet. My guess is that this part of the work has not been released. Simply passing the 3DMM parameters should not be enough to reproduce the author's demo results; a facial motion retargeting module is probably needed to transfer cross-id motion naturally and effectively.

DWCTOD avatar Dec 08 '21 08:12 DWCTOD

How do I use custom images instead of video_item['video_name']?

loboere avatar Dec 14 '21 20:12 loboere