Text-To-Video-Finetuning
Feature request
Thank you for making this. It seems to work, and I have a model.
I wanted to ask if there is:
- a link to a repository we can use to generate videos with our new diffusion models, or a small example of how to do it in Python.
- a way to specify the frame rate of the sample videos. Everything seems to sample at 6-8 fps, so the default 24 fps videos play too fast to really see what the sample video looks like.
- if we use a JSON file, do we also need to specify the video folder, or do the file paths in the JSON take care of that?
Thank you!
Hi! As for the first point, there's a webui plugin for Auto1111, https://github.com/deforum-art/sd-webui-text2video, with a GUI where you can specify anything for your generation. To convert your finetuned models for use in that GUI, use the script in this repo: https://github.com/ExponentialML/Text-To-Video-Finetuning/blob/main/utils/convert_diffusers_to_original_ms_text_to_video.py
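If you'd rather generate in plain Python without the webui, a minimal sketch along these lines should work, assuming your training output is a standard diffusers folder (the one containing model_index.json). The model path, prompt, frame count, and fps below are all placeholders; writing the file yourself with cv2 also covers the frame-rate question, since you pick the fps at save time:

```python
import cv2
import torch
from diffusers import DiffusionPipeline

# Load the finetuned diffusers-format model (the folder with model_index.json).
pipe = DiffusionPipeline.from_pretrained(
    "./outputs/train_2023-04-24T00-05-34",  # placeholder: your training output folder
    torch_dtype=torch.float16,
).to("cuda")

# Generate a short clip; .frames is a list of HxWx3 uint8 arrays here
# (newer diffusers versions may nest this per batch).
frames = pipe("a dog running on the beach", num_frames=16).frames

# Write the video at an explicit frame rate (8 fps instead of the default
# 24 fps that makes the samples look too fast).
height, width = frames[0].shape[:2]
writer = cv2.VideoWriter("sample.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 8, (width, height))
for frame in frames:
    writer.write(cv2.cvtColor(frame, cv2.COLOR_RGB2BGR))  # diffusers frames are RGB
writer.release()
```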
Thank you, kabachuha. For convert_diffusers_to_original_ms_text_to_video.py, what arguments do I need to put in? Should I put the root folder of the model, or link directly to the bin files for the UNet and text encoder? And do I need to specify an output folder? Thank you!
```
python convert_diffusers_to_original_ms_text_to_video.py --model_path path-to-your-diffusers-model-folder --checkpoint_path text2video_pytorch_model.pth --clip_checkpoint_path clip.ckpt
```

Don't use the resulting clip.ckpt, though; it's not converted well at the moment, and I need to remove it from the requirements.
So should I put in the clip checkpoint path and just not use the clip file that is created, or should I leave the clip checkpoint path blank?
@justinwking use this branch for now, until it's merged: https://github.com/kabachuha/Text-To-Video-Finetuning/tree/patch-1
Sorry to ask such basic questions, but I couldn't find the files you suggested I include, so I'm guessing they have a different name. At the bottom of this post I've written my interpretation of what I think you meant; please correct me if I'm mistaken. This is my folder structure:
Text to video Fine Tuning
- Models
  - Model_scope_diffusers
    - Scheduler
    - Text_encoder
    - Tokenizer
    - Unet
    - Vae
- Outputs
  - Train 2023…
    - Cached Latents
    - Checkpoint 2500
    - Checkpoint 5000
      - Scheduler
      - Text-encoder
      - Tokenizer
      - Unet
      - Vae
    - Lora
    - Samples
Does the following command look correct if I do everything from the Text-To-Video-Finetuning folder?
```
python ./Utils/convert_diffusers_to_original_ms_text_to_video.py --model_path models/model_scope_diffusers/ --checkpoint_path "outputs/Train 2023…/Lora/5000_unet.pt" --clip_checkpoint_path "outputs/Train 2023…/Lora/5000_text_encoder.pt"
```
Use this folder as model_path: "./Outputs/Train 2023…/Checkpoint 5000"
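Putting that together with the earlier command, and noting that --checkpoint_path and --clip_checkpoint_path name the files the script writes out (not your Lora .pt files), the full invocation would presumably look something like this, with the run-folder name left elided as above:

```
python ./utils/convert_diffusers_to_original_ms_text_to_video.py --model_path "./Outputs/Train 2023…/Checkpoint 5000" --checkpoint_path text2video_pytorch_model.pth --clip_checkpoint_path clip.ckpt
```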
Good morning. I believe I was able to get the script to work with your instructions, but I didn't see a new folder created. What do I need to do to get this into a format and location that t2v can use? All the file names are different, and the folder structure is different. Is this something that the script could do?
I haven't been able to find a readme that explains the process, maybe there is one that I overlooked.
The following was generated when I did the training:
```
Configuration saved in ./outputs\train_2023-04-24T00-05-34\vae\config.json
Model weights saved in ./outputs\train_2023-04-24T00-05-34\vae\diffusion_pytorch_model.bin
Configuration saved in ./outputs\train_2023-04-24T00-05-34\unet\config.json
Model weights saved in ./outputs\train_2023-04-24T00-05-34\unet\diffusion_pytorch_model.bin
Configuration saved in ./outputs\train_2023-04-24T00-05-34\scheduler\scheduler_config.json
Configuration saved in ./outputs\train_2023-04-24T00-05-34\model_index.json
04/24/2023 06:13:39 - INFO - main - Saved model at ./outputs\train_2023-04-24T00-05-34 on step 10000
```
Then I put in the command:

```
(text2video-finetune) python ./Utils/convert_diffusers_to_original_ms_text_to_video.py --model_path "./Outputs/train_2023-04-24T00-05-34/Checkpoint-10000" --checkpoint_path "./Outputs/train_2023-04-24T00-05-34/Lora/10000_unet.pt" --clip_checkpoint_path "./Outputs/train_2023-04-24T00-05-34/Lora/10000_text_encoder.pt"
```
and the process worked, but I don't know where the new UNET is...
```
Saving UNET
Operation successfull
```
But now... I don't see anything that looks like the ModelScope folder that I am currently using in Automatic1111:
```
configuration.json
open_clip_pytorch_model.bin
README.md
text2video_pytorch_model.pth
VQGAN_autoencoder.pth
```
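For what it's worth: judging from the flags above, the script writes the converted weights to whatever file you pass as --checkpoint_path (so in the last run they went to ./Outputs/train_2023-04-24T00-05-34/Lora/10000_unet.pt) rather than creating a new model folder. One way to get a folder the extension can load, sketched below with every path an assumption about the local setup, is to reuse the auxiliary files from the ModelScope folder already in Automatic1111 and drop the converted weights in under the expected name:

```python
# Sketch only: assemble a ModelScope-style folder for sd-webui-text2video by
# reusing the auxiliary files from the existing model folder and adding the
# freshly converted UNet weights. Every path here is an assumption.
import shutil
from pathlib import Path

src = Path("stable-diffusion-webui/models/ModelScope/t2v")            # existing base model
dst = Path("stable-diffusion-webui/models/ModelScope/t2v_finetuned")  # hypothetical new folder
dst.mkdir(parents=True, exist_ok=True)

# Reuse these from the base model; the converted clip.ckpt isn't reliable yet
# (per the reply above), so the original open_clip weights stay in place.
for name in ("configuration.json", "open_clip_pytorch_model.bin", "VQGAN_autoencoder.pth"):
    shutil.copy2(src / name, dst / name)

# The converted UNet is whatever file was passed as --checkpoint_path;
# rename it to the filename the extension expects.
shutil.copy2(
    "Outputs/train_2023-04-24T00-05-34/Lora/10000_unet.pt",  # from the command above
    dst / "text2video_pytorch_model.pth",
)
```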