Export at the same size as the source image?

Open salimbenfarhat opened this issue 1 year ago • 4 comments

Hello, I currently use this script to make my videos, but it is not optimized for my needs:

# selected audio from examples/driven_audio
default_head_name = type('', (), {})() # Create a simple object to mimic the expected structure
default_head_name.value = 'profile' # Replace 'profile' with the actual name of your source image, without the .png extension

img = 'examples/source_image/{}.png'.format(default_head_name.value)
!python3.8 inference.py --driven_audio ./examples/driven_audio/audio_profile.mp3 \
           --source_image {img} \
           --result_dir ./results --still --preprocess full --enhancer gfpgan

I am looking for a way to export all my videos at 1080x1920, but I have not found one. I also want to keep the same resolution as the source image. How should I edit my command line to do this, please?

salimbenfarhat avatar Mar 13 '24 06:03 salimbenfarhat

Which option is best for this, please?

usage: inference.py [-h] [--driven_audio DRIVEN_AUDIO] [--source_image SOURCE_IMAGE]
                    [--ref_eyeblink REF_EYEBLINK] [--ref_pose REF_POSE]
                    [--checkpoint_dir CHECKPOINT_DIR] [--result_dir RESULT_DIR]
                    [--pose_style POSE_STYLE] [--batch_size BATCH_SIZE] [--size SIZE]
                    [--expression_scale EXPRESSION_SCALE] [--input_yaw INPUT_YAW [INPUT_YAW ...]]
                    [--input_pitch INPUT_PITCH [INPUT_PITCH ...]]
                    [--input_roll INPUT_ROLL [INPUT_ROLL ...]] [--enhancer ENHANCER]
                    [--background_enhancer BACKGROUND_ENHANCER] [--cpu] [--face3dvis] [--still]
                    [--preprocess {crop,extcrop,resize,full,extfull}] [--verbose] [--old_version]
                    [--net_recon {resnet18,resnet34,resnet50}] [--init_path INIT_PATH]
                    [--use_last_fc USE_LAST_FC] [--bfm_folder BFM_FOLDER] [--bfm_model BFM_MODEL]
                    [--focal FOCAL] [--center CENTER] [--camera_d CAMERA_D] [--z_near Z_NEAR]
                    [--z_far Z_FAR]

salimbenfarhat avatar Mar 13 '24 06:03 salimbenfarhat

Which option is best for this, please?

Option 1: Using --size for fixed sizes

img = 'examples/source_image/{}.png'.format(default_head_name.value)
!python3.8 inference.py --driven_audio ./examples/driven_audio/audio_profile.mp3 \
           --source_image {img} \
           --result_dir ./results --still --preprocess full --enhancer gfpgan --size 1080,1920

Option 2: Using --preprocess resize for aspect ratio adjustment

img = 'examples/source_image/{}.png'.format(default_head_name.value)
!python3.8 inference.py --driven_audio ./examples/driven_audio/audio_profile.mp3 \
           --source_image {img} \
           --result_dir ./results --still --preprocess resize --enhancer gfpgan

AppStolz avatar Apr 14 '24 03:04 AppStolz
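
For the "keep the same resolution" part of the question, the dimensions of the source image have to be known first. Here is a minimal sketch, assuming the source is a PNG as in the commands above, that reads the width and height from the file header using only the standard library. The helper names png_size and png_size_from_file are hypothetical and not part of SadTalker:

```python
import struct
from typing import Tuple

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header: bytes) -> Tuple[int, int]:
    """Return (width, height) from the first 24 bytes of a PNG file."""
    if (
        len(header) < 24
        or header[:8] != PNG_SIGNATURE
        or header[12:16] != b"IHDR"
    ):
        raise ValueError("not a PNG header")
    # The IHDR chunk stores width and height as two big-endian
    # unsigned 32-bit integers, right after the chunk type.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def png_size_from_file(path: str) -> Tuple[int, int]:
    """Hypothetical convenience wrapper: read the header from disk."""
    with open(path, "rb") as f:
        return png_size(f.read(24))
```

Calling png_size_from_file('examples/source_image/profile.png') would give the target size to aim for when resizing the generated video.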

Hello, I am confused by this problem too. I am working with an image size of 2000*2000. When I added --size 2000 2000 at the end of my command (Option 1), I got an error: unrecognized arguments: 2000. Then I added only --size 2000, and the error became: FileNotFoundError: No such file or directory: "./checkpoints\SadTalker_V0.0.2_2000.safetensors". Do you know how to solve this issue? Many thanks.

Zeelyne avatar Apr 18 '24 08:04 Zeelyne
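
Both error messages are consistent with --size being a single integer that selects a checkpoint file by name. The sketch below reproduces that behaviour with argparse. It assumes that the option is declared as one int and that its default is 256; the naming pattern is taken from the FileNotFoundError quoted above. build_parser and checkpoint_path are hypothetical names, not SadTalker's actual code:

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="inference.py")
    # Assumption: --size takes exactly one integer (default assumed to be 256).
    parser.add_argument("--size", type=int, default=256)
    return parser


def checkpoint_path(checkpoint_dir: str, size: int) -> str:
    # Naming pattern copied from the FileNotFoundError in the post above.
    return os.path.join(
        checkpoint_dir, "SadTalker_V0.0.2_{}.safetensors".format(size)
    )


parser = build_parser()
# parse_known_args returns leftovers instead of exiting, so the
# "unrecognized arguments" case can be inspected directly.
args, leftover = parser.parse_known_args(["--size", "2000", "2000"])
print("size:", args.size)
print("unrecognized:", leftover)
print("checkpoint looked up:", checkpoint_path("./checkpoints", args.size))
```

Under these assumptions, the second 2000 is left over as an unrecognized argument, and a single --size 2000 leads to a lookup for a checkpoint file that has to exist in ./checkpoints.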

Hey @Zeelyne,

I noticed that you're facing some issues with running the SadTalker inference script. I've identified a couple of potential issues with your command and have provided a solution below:

Fix for Unrecognized Arguments and FileNotFoundError

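  1. Size Argument Format: --size takes a single value, so in --size 2000 2000 the second 2000 is rejected as an unrecognized argument. The value appears to be parsed as one integer, so a comma-separated form such as --size 2000,2000 is unlikely to be accepted either.

  2. Checkpoint File: The script derives the checkpoint name from the size, so --size 2000 looks for SadTalker_V0.0.2_2000.safetensors in the ./checkpoints directory. That file has to be present for the size to work; as far as I know, the released checkpoints only cover sizes 256 and 512.

Updated Command

Here's the updated command, using a size that should have a matching checkpoint (this assumes SadTalker_V0.0.2_512.safetensors is in ./checkpoints). The resulting video can then be scaled to the target resolution in a separate step:

img = 'examples/source_image/{}.png'.format(default_head_name.value)
!python3.8 inference.py --driven_audio ./examples/driven_audio/audio_profile.mp3 \
           --source_image {img} \
           --result_dir ./results --still --preprocess full --enhancer gfpgan --size 512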

AppStolz avatar Apr 22 '24 17:04 AppStolz
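
Since --size appears to select a model checkpoint rather than the output resolution, one way to get a 1080x1920 file is to rescale the generated video afterwards. Here is a minimal sketch, assuming ffmpeg is installed. It only builds the command line, so nothing is executed until it is passed to subprocess.run. The file names and the build_scale_command helper are hypothetical:

```python
import shlex
from typing import List


def build_scale_command(src: str, dst: str, width: int, height: int) -> List[str]:
    """Build an ffmpeg command that fits src into a width x height frame."""
    # scale: shrink or grow to fit inside the target while keeping aspect ratio.
    # pad: fill the remaining area and centre the picture in the frame.
    video_filter = (
        "scale={w}:{h}:force_original_aspect_ratio=decrease,"
        "pad={w}:{h}:(ow-iw)/2:(oh-ih)/2"
    ).format(w=width, h=height)
    # -y overwrites dst if it exists; -c:a copy keeps the audio track as is.
    return ["ffmpeg", "-y", "-i", src, "-vf", video_filter, "-c:a", "copy", dst]


# Hypothetical input and output paths, for illustration only.
cmd = build_scale_command(
    "results/generated.mp4", "results/generated_1080x1920.mp4", 1080, 1920
)
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run it, pass the list to subprocess.run(cmd, check=True). Upscaling this way only changes the frame size; it does not add detail beyond what the model rendered.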