take a look at this
https://github.com/Zejun-Yang/AniPortrait - a new open-source talking head generation repo. It seems very similar to Emote.
I did see it - I'll add it to the README. Seems like a complete rip-off of the MooreThreads AnimateAnyone PoseGuider: https://github.com/MooreThreads/Moore-AnimateAnyone/blob/master/train_stage_1.py#L54
The disappointing thing is there's no training code, so all the models are locked up.
Maybe you can refer to their model code - at least we have that. The training code looks just like AnimateAnyone's.
Looking at the saved models, it looks less scary than this mess: https://github.com/MStypulkowski/diffused-heads/issues/21
We can probably just load these into a simpler architecture.
I think the reader/writer is just there to run the parallel UNet (to dig the features out of ReferenceNet - the reader - and throw them to the backbone): https://github.com/Zejun-Yang/AniPortrait/blob/main/train_stage_1.py#L53
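For what it's worth, here's a toy sketch of that writer/reader idea - my own naming and shapes, not AniPortrait's code. In the real repos ReferenceNet and the denoising UNet are two separate networks initialised from the same SD weights; here one block list plays both roles to keep the sketch short.

```python
# "write" banks the reference self-attention features; "read" lets the backbone's
# self-attention attend over them as extra tokens.
import torch
import torch.nn as nn

class SelfAttnBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.feature_bank = []      # filled during the reference ("write") pass
        self.mode = "write"         # "write" or "read"

    def forward(self, x):
        h = self.norm(x)
        if self.mode == "write":
            self.feature_bank.append(h.detach())            # writer: stash reference features
            kv = h
        else:
            kv = torch.cat([h] + self.feature_bank, dim=1)  # reader: self + reference tokens
        out, _ = self.attn(h, kv, kv)
        return x + out

def set_mode(blocks, mode):
    for b in blocks:
        b.mode = mode

blocks = nn.ModuleList([SelfAttnBlock(64) for _ in range(2)])

ref_tokens = torch.randn(1, 77, 64)    # features of the reference image
noisy_tokens = torch.randn(1, 77, 64)  # features of the noisy target frame

set_mode(blocks, "write")
h = ref_tokens
for b in blocks:
    h = b(h)                           # reference pass fills the banks

set_mode(blocks, "read")
h = noisy_tokens
for b in blocks:
    h = b(h)                           # denoising pass reads the banked features
print(h.shape)                         # torch.Size([1, 77, 64])
```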
Yes, I also think the reader/writer is used to implement this, from the paper's description: "The image of the target character is input into ReferenceNet to extract the reference feature maps output from the self-attention layers. During the Backbone denoising procedure, the features of the corresponding layers go through reference-attention layers with the extracted feature maps." And the second link looks like the pretrained weights from Moore-AnimateAnyone (another talking head generation repo). We're now also trying to implement EMO, referring to your existing repo, and I'm finishing the Face Locator today. Thanks!
Did you see this? https://github.com/johndpope/Emote-hack/issues/28 - I think we can just piggyback off the Alibaba pretrained UNet model.
OK, I'll take a look. We can use a pretrained model - in fact, Alibaba also uses a pretrained model (Stable Diffusion v1.5 from Hugging Face). ReferenceNet and the Backbone inherit their weights from the original SD UNet; only the attention layers were changed.
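For concreteness, a minimal sketch of what that weight inheritance looks like in practice (assumes the diffusers package; "runwayml/stable-diffusion-v1-5" is the usual checkpoint id, swap in whichever copy of the SD 1.5 weights you have):

```python
from diffusers import UNet2DConditionModel

base = "runwayml/stable-diffusion-v1-5"
reference_unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet")
denoising_unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet")
# both start from identical SD 1.5 weights; the repos then only swap/extend the
# attention processors (reference attention + temporal/motion modules) on top.
```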
that's why I'm thinking it will be plug and play. Got all the models for AniPortrait - check this helper out https://github.com/xmu-xiaoma666/External-Attention-pytorch/issues/115
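A quick, generic way to peek inside the downloaded .pth files before trying to load them into a simpler architecture (the file names below are the ones from the AniPortrait yaml further down this thread - adjust paths as needed):

```python
import torch

for name in ["denoising_unet.pth", "reference_unet.pth",
             "pose_guider.pth", "motion_module.pth"]:
    sd = torch.load(f"./pretrained_model/{name}", map_location="cpu")
    print(name, "-", len(sd), "tensors")
    for k in list(sd)[:5]:                      # first few keys as a sanity check
        print("  ", k, tuple(sd[k].shape))
```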
Yes, maybe it's just a combination of existing methods and modules. I'll take a look later.
AniPortrait is good. I thought ControlNetMediaPipeFace might be the best solution: https://github.com/johndpope/Emote-hack/issues/23
@chris-crucible - it seems they've enhanced the lips beyond the base MediaPipe face mesh - maybe worth retraining the model? Here's the default sample: https://drive.google.com/file/d/198TWE631UX1z_YzbT31ItdF6yhSyLycL/view?usp=sharing
https://github.com/Zejun-Yang/AniPortrait/blob/bfa15742af3233c297c72b8bb5d7637c5ef5984a/src/utils/draw_util.py#L36
FACEMESH_LIPS_OUTER_BOTTOM_LEFT = [(61,146),(146,91),(91,181),(181,84),(84,17)]
FACEMESH_LIPS_OUTER_BOTTOM_RIGHT = [(17,314),(314,405),(405,321),(321,375),(375,291)]
FACEMESH_LIPS_INNER_BOTTOM_LEFT = [(78,95),(95,88),(88,178),(178,87),(87,14)]
FACEMESH_LIPS_INNER_BOTTOM_RIGHT = [(14,317),(317,402),(402,318),(318,324),(324,308)]
FACEMESH_LIPS_OUTER_TOP_LEFT = [(61,185),(185,40),(40,39),(39,37),(37,0)]
FACEMESH_LIPS_OUTER_TOP_RIGHT = [(0,267),(267,269),(269,270),(270,409),(409,291)]
FACEMESH_LIPS_INNER_TOP_LEFT = [(78,191),(191,80),(80,81),(81,82),(82,13)]
FACEMESH_LIPS_INNER_TOP_RIGHT = [(13,312),(312,311),(311,310),(310,415),(415,308)]
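For anyone who wants to visualise those extra edges, here's a rough sketch (not the repo's draw_util itself) that draws them with OpenCV, assuming the FACEMESH_* lists above and an (N, 2) array of face-mesh landmarks already projected to pixel coordinates:

```python
import cv2
import numpy as np

LIP_EDGES = (FACEMESH_LIPS_OUTER_BOTTOM_LEFT + FACEMESH_LIPS_OUTER_BOTTOM_RIGHT +
             FACEMESH_LIPS_INNER_BOTTOM_LEFT + FACEMESH_LIPS_INNER_BOTTOM_RIGHT +
             FACEMESH_LIPS_OUTER_TOP_LEFT + FACEMESH_LIPS_OUTER_TOP_RIGHT +
             FACEMESH_LIPS_INNER_TOP_LEFT + FACEMESH_LIPS_INNER_TOP_RIGHT)

def draw_lips(image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    # each entry is a (start, end) pair of MediaPipe landmark indices
    for start, end in LIP_EDGES:
        p0 = tuple(int(v) for v in landmarks[start])
        p1 = tuple(int(v) for v in landmarks[end])
        cv2.line(image, p0, p1, color=(0, 0, 255), thickness=2)
    return image
```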
I get that there are still expression issues here, but the result is quite good.
The head rotations might be a nice branch for getting the emotion into the video.
Have you run through the entire process? Congratulations! Let me take a look at the repo and code!
python ./scripts/vid2vid.py --config ./configs/prompts/animation_facereenac.yaml -W 512 -H 512 -L 256
animation_facereenac.yaml
pretrained_base_model_path: '/media/2TB/Emote-hack/pretrained_models/StableDiffusion'
pretrained_vae_path: "stabilityai/sd-vae-ft-mse"
image_encoder_path: '/media/oem/12TB/AniPortrait/pretrained_model/image_encoder'
denoising_unet_path: "./pretrained_model/denoising_unet.pth"
reference_unet_path: "./pretrained_model/reference_unet.pth"
pose_guider_path: "./pretrained_model/pose_guider.pth"
motion_module_path: "./pretrained_model/motion_module.pth"
inference_config: "./configs/inference/inference_v2.yaml"
weight_dtype: 'fp16'
test_cases:
  "./configs/inference/ref_images/lyl.png":
    - '/media/2TB/Emote-hack/junk/M2Ohb0FAaJU_1.mp4'
There's no speed embedding, so the vanilla image-to-video will mostly hold the face still in the video - but because they're using the AnimateAnyone framework, they get video2video out of the box, allowing this: https://drive.google.com/file/d/1HaHPZbllOVPhbGkvV3aHLtcEew9CZGUV/view
hey @fenghe12
I had some success with MegaPortraits: https://github.com/johndpope/MegaPortrait-hack
and am now attempting to integrate it into VASA on this branch: https://github.com/johndpope/VASA-1-hack/tree/MegaPortraits
VASA adopts DiT as the backbone denoising network, but the paper lacks details about how to integrate the conditions into the DiT. I attempted to replace the UNet in Moore-AnimateAnyone with a DiT (Latte, a video generation model), but the results were not satisfactory. We are now trying to train a talking-face video generation model based on Open-Sora-Plan.
I guess DiT will become the mainstream video generation architecture because of Sora.
Maybe I can offer some help.
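For concreteness, a toy sketch (my own, not VASA or Latte code) of the two usual ways conditions get injected into a DiT block - adaLN modulation for global signals like the timestep and head pose, and cross-attention for token sequences like audio features:

```python
import torch
import torch.nn as nn

class DiTBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim, elementwise_affine=False)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim, elementwise_affine=False)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.ada = nn.Sequential(nn.SiLU(), nn.Linear(dim, 6 * dim))  # predicts shift/scale/gate

    def forward(self, x, global_cond, token_cond):
        s1, sc1, g1, s2, sc2, g2 = self.ada(global_cond).chunk(6, dim=-1)
        h = self.norm1(x) * (1 + sc1.unsqueeze(1)) + s1.unsqueeze(1)   # adaLN modulation
        x = x + g1.unsqueeze(1) * self.attn(h, h, h)[0]                # gated self-attention
        x = x + self.cross(x, token_cond, token_cond)[0]               # cross-attn over audio tokens
        h = self.norm2(x) * (1 + sc2.unsqueeze(1)) + s2.unsqueeze(1)
        return x + g2.unsqueeze(1) * self.mlp(h)

block = DiTBlock(256)
x = torch.randn(2, 64, 256)   # latent patch tokens
g = torch.randn(2, 256)       # timestep + pose/expression embedding
a = torch.randn(2, 50, 256)   # audio feature tokens
print(block(x, g, a).shape)   # torch.Size([2, 64, 256])
```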
I'm attempting to port matmul-free LLM to PyTorch:
https://github.com/ridgerchu/matmulfreellm
I sent you a link - it could be more exciting if I can get the CUDA code working.
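As I understand it (treat this as a rough approximation of what matmulfreellm's BitLinear-style layers do, not its actual fused kernels), the core trick is ternary weights, which turn each "matmul" into signed additions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        scale = w.abs().mean().clamp(min=1e-5)                     # absmean scale
        w_ternary = torch.clamp(torch.round(w / scale), -1, 1)     # weights in {-1, 0, +1}
        # straight-through estimator: ternary forward, full-precision gradient
        w_q = w / scale + (w_ternary - w / scale).detach()
        return F.linear(x, w_q * scale)

layer = TernaryLinear(16, 8)
print(layer(torch.randn(4, 16)).shape)   # torch.Size([4, 8])
```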
Sorry, but I can't open your invitation link. It gives me a 404 error.
Sorry - that project was 3 days down the gurgler, but I learned how to compile CUDA code.
Not sure how to handle the audio stuff - wav2vec - https://github.com/johndpope/VASA-1-hack/tree/MegaPortraits
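A common starting point for the audio side (a minimal sketch assuming the transformers package and the public facebook/wav2vec2-base-960h checkpoint - not whatever VASA actually uses, which isn't specified) is just extracting per-frame wav2vec 2.0 features for the conditioning module:

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "facebook/wav2vec2-base-960h"
extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id).eval()

waveform = torch.randn(16000)  # placeholder: 1 second of 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    feats = model(**inputs).last_hidden_state   # per-frame audio features, ~50 frames/s
print(feats.shape)  # roughly (1, 49, 768)
```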
How can I help you?