stable-diffusion-videos
Add CLI script
This PR adds a script, scripts/make_video.py, to make videos from the command line, for those who, like me, prefer that over notebooks, e.g. to run on a cluster node. The script accepts most, if not all, of the arguments featured in README.md. The help message looks like this:
$ python scripts/make_video.py --help
usage: make_video.py [-h] [--checkpoint_id CHECKPOINT_ID] [--prompts PROMPTS [PROMPTS ...]] [--seeds SEEDS [SEEDS ...]]
                     [--num_interpolation_steps NUM_INTERPOLATION_STEPS [NUM_INTERPOLATION_STEPS ...]] [--output_dir OUTPUT_DIR] [--name NAME] [--fps FPS]
                     [--guidance_scale GUIDANCE_SCALE] [--num_inference_steps NUM_INFERENCE_STEPS] [--height HEIGHT] [--width WIDTH] [--upsample]
                     [--batch_size BATCH_SIZE] [--audio_filepath AUDIO_FILEPATH] [--audio_offsets AUDIO_OFFSETS [AUDIO_OFFSETS ...]]
                     [--negative_prompt NEGATIVE_PROMPT] [--cfg CFG]

options:
  -h, --help            show this help message and exit
  --checkpoint_id CHECKPOINT_ID
                        checkpoint id on huggingface (default: stabilityai/stable-diffusion-2-1)
  --prompts PROMPTS [PROMPTS ...]
                        sequence of prompts (default: None)
  --seeds SEEDS [SEEDS ...]
                        seed for each prompt (default: None)
  --num_interpolation_steps NUM_INTERPOLATION_STEPS [NUM_INTERPOLATION_STEPS ...]
                        number of steps between each image (default: None)
  --output_dir OUTPUT_DIR
                        output directory (default: dreams)
  --name NAME           output sub-directory (default: None)
  --fps FPS             frames per second (default: 10)
  --guidance_scale GUIDANCE_SCALE
                        diffusion guidance scale (default: 7.5)
  --num_inference_steps NUM_INFERENCE_STEPS
                        number of diffusion inference steps (default: 50)
  --height HEIGHT       output image height (default: 512)
  --width WIDTH         output image width (default: 512)
  --upsample            upscale x4 using Real-ESRGAN (default: False)
  --batch_size BATCH_SIZE
                        batch size (default: 1)
  --audio_filepath AUDIO_FILEPATH
                        path to audio file (default: None)
  --audio_offsets AUDIO_OFFSETS [AUDIO_OFFSETS ...]
                        audio offset for each prompt (default: None)
  --negative_prompt NEGATIVE_PROMPT
                        negative prompt (one for all images) (default: None)
  --cfg CFG             yaml config file (overwrites other options) (default: None)
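For readers curious how a help message of this shape comes about, here is a minimal sketch of an argparse parser covering a subset of the options. The "(default: ...)" suffixes suggest argparse.ArgumentDefaultsHelpFormatter, but that is an inference from the output, not a copy of scripts/make_video.py. The name build_parser and the int/float types are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a subset of the options in the help message (illustrative)."""
    parser = argparse.ArgumentParser(
        prog="make_video.py",
        # Appends "(default: ...)" to the help string of each option below.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("--checkpoint_id", default="stabilityai/stable-diffusion-2-1",
                        help="checkpoint id on huggingface")
    # nargs="+" accepts one or more values, rendered as "PROMPTS [PROMPTS ...]".
    parser.add_argument("--prompts", nargs="+", help="sequence of prompts")
    parser.add_argument("--seeds", type=int, nargs="+", help="seed for each prompt")
    parser.add_argument("--fps", type=int, default=10, help="frames per second")
    parser.add_argument("--guidance_scale", type=float, default=7.5,
                        help="diffusion guidance scale")
    # store_true flags take no value and default to False.
    parser.add_argument("--upsample", action="store_true",
                        help="upscale x4 using Real-ESRGAN")
    parser.add_argument("--cfg", help="yaml config file (overwrites other options)")
    return parser


# Example: parse an explicit argument list instead of sys.argv.
example = build_parser().parse_args(["--prompts", "a cat", "a dog", "--fps", "24"])
```

Calling build_parser().print_help() prints a listing in the same layout as the one above, limited to the options defined in the sketch.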
The user can also provide a YAML configuration file directly with python scripts/make_video.py --cfg <config_file>. The file should contain fields with the same names as the arguments; its values overwrite the corresponding options.
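As a rough illustration of the override behaviour described here, the sketch below merges a config mapping into an already-parsed argparse namespace. It is not the code from this PR: load_config and apply_config are hypothetical names, reading the file assumes the third-party PyYAML package, and rejecting unknown keys is a choice made for this example only.

```python
import argparse
from typing import Any, Dict


def load_config(path: str) -> Dict[str, Any]:
    """Read a YAML file into a dict (assumes PyYAML is installed)."""
    import yaml  # imported lazily so the rest of the sketch runs without PyYAML

    with open(path) as f:
        # safe_load returns None for an empty file, so fall back to {}.
        return yaml.safe_load(f) or {}


def apply_config(args: argparse.Namespace, config: Dict[str, Any]) -> argparse.Namespace:
    """Overwrite parsed arguments with the values found in the config."""
    known = vars(args)
    for key, value in config.items():
        if key not in known:
            # Rejecting unknown keys is a design choice of this sketch.
            raise KeyError(f"unknown option in config file: {key}")
        setattr(args, key, value)
    return args


# Example: fields named like the arguments replace the parsed values.
args = argparse.Namespace(fps=10, height=512, prompts=None)
apply_config(args, {"fps": 30, "prompts": ["a cat", "a dog"]})
```

With this approach a config file only needs to list the options it changes; anything it omits keeps its command-line or default value.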
The script is the same whether or not the user wants to add audio. To add audio, provide the --audio_filepath and --audio_offsets arguments.
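One way to keep a single script for both cases is to decide up front whether the audio arguments are usable. The following is only a sketch under assumptions: check_audio_args is a hypothetical helper, and the rule of one offset per prompt is taken from the help text ("audio offset for each prompt"), not from the script itself.

```python
from typing import List, Optional


def check_audio_args(prompts: List[str],
                     audio_filepath: Optional[str],
                     audio_offsets: Optional[List[float]]) -> bool:
    """Return True if audio should be added; raise on inconsistent arguments."""
    if audio_filepath is None and audio_offsets is None:
        # Neither argument given: make a silent video.
        return False
    if audio_filepath is None or audio_offsets is None:
        raise ValueError("--audio_filepath and --audio_offsets must be given together")
    if len(audio_offsets) != len(prompts):
        # Assumption from the help text: one offset per prompt.
        raise ValueError("expected one audio offset per prompt")
    return True
```

For example, check_audio_args(["a cat", "a dog"], "song.mp3", [0, 5]) accepts the arguments, while passing only one of the two audio options raises a ValueError before any frames are generated.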
In my opinion, this deprecates examples/make_music_video.py. That file seems to be broken anyway (see #150). If the purpose of that script is to serve as a code example, then the snippets in README.md currently do a better job. If its purpose is to provide a standalone script ready to run from the command line, then this PR implements that and more.
Updated README.md with an example.
Maybe worth noting: setting the batch_size option to anything but 1 is going to break on MPS.
Right. We could hard-set batch_size=1 on MPS and raise a warning if the user provided anything different.
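A minimal sketch of that fallback, assuming the device is available as a plain string such as "mps" or "cuda"; resolve_batch_size is a hypothetical name, not something already in the script:

```python
import warnings


def resolve_batch_size(device: str, batch_size: int) -> int:
    """Return the batch size to use, forcing 1 on MPS with a warning."""
    if device == "mps" and batch_size != 1:
        warnings.warn(
            f"batch_size={batch_size} is not supported on MPS; using batch_size=1 instead."
        )
        return 1
    return batch_size
```

The warning fires only when the user asked for something other than 1, so the default configuration stays silent on MPS.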
I still haven't started applying the suggested changes; I will do so soon.
No rush :) whenever you get to it. I appreciate your contributions ❤️