
Finetune ModelScope's Text To Video model using Diffusers 🧨

Results: 28 Text-To-Video-Finetuning issues, sorted by recently updated

[`return rearrange(item / (127.5 - 1.0), 'f h w c -> f c h w')`](https://github.com/ExponentialML/Text-To-Video-Finetuning/blob/83e11c702b2fb30248e488bc0a11680cfaa56558/utils/dataset.py#L41C20-L41C74) should be changed to `return rearrange(item / 127.5 - 1.0, 'f h w c -> f c h w')`, so that pixel values are normalized to [-1, 1] instead of being divided by 126.5.
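The difference is pure operator precedence. A minimal plain-Python sketch (standing in for the torch tensor and einops call in the real code) shows the two expressions:

```python
# item / (127.5 - 1.0) divides by 126.5 and leaves values positive,
# while the intended item / 127.5 - 1.0 maps uint8 pixels to [-1, 1].

def buggy(pixel):
    return pixel / (127.5 - 1.0)   # divides by 126.5; range ~[0, 2.016]

def fixed(pixel):
    return pixel / 127.5 - 1.0     # maps 0..255 onto -1.0..1.0

print(fixed(0), fixed(255))   # -1.0 1.0
print(buggy(0), buggy(255))   # 0.0 and roughly 2.016
```

With the buggy form, the VAE receives inputs outside its expected [-1, 1] range, which quietly degrades training.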

- [x] Allow for multiple cached latents.
- [x] Update param to allow shuffling.
- [x] Automatically cast to float32. Uses more memory, but encourages better stability.
- [x] Allow...

```
Some weights of the model checkpoint were not used when initializing UNet3DConditionModel:
This IS expected if you are initializing CLIPTextModel from the checkpoint of a model trained on another...
```

Hello, after training the model, how do I test it by giving text as input? Please help me with this.
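A minimal inference sketch, assuming the finetuned weights were saved in diffusers format under `./outputs/my_run` (that path is hypothetical; use wherever your training run wrote its checkpoint). It uses the standard `DiffusionPipeline` loader and the `export_to_video` utility from diffusers:

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the finetuned checkpoint; the directory must contain the
# diffusers-format model files written by the training script.
pipe = DiffusionPipeline.from_pretrained(
    "./outputs/my_run",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

result = pipe("a dog running on the beach", num_frames=16)
# Depending on the diffusers version, the frames may be under
# result.frames or result.frames[0]; adjust accordingly.
export_to_video(result.frames, "sample.mp4")
```

This requires a CUDA GPU and the trained weights on disk; it is a sketch of the usual diffusers inference flow, not a command taken from this repository.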

More of a question really, but do you know why `num_attention_heads` and `attention_head_dim` are opposite when initialising the Transformer2D blocks? https://github.com/ExponentialML/Text-To-Video-Finetuning/blob/79e13d17167f66f424a8acad88e83fc76d6d210d/models/unet_3d_blocks.py#L286C17-L286C35 It is the opposite order in `unet_2d_blocks.py`: https://github.com/huggingface/diffusers/blob/5439e917cacc885c0ac39dda1b8af12258e6e16d/src/diffusers/models/unet_2d_blocks.py#L872
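One reason such a swap can go unnoticed: the transformer's inner dimension is `num_attention_heads * attention_head_dim`, which is symmetric in the two arguments, so all weight shapes (and therefore checkpoint loading) stay identical; only how that dimension is partitioned into heads changes. A small illustrative sketch (the numbers are examples, not values from the repo):

```python
# Swapping heads and head_dim preserves the inner dimension, so linear
# layer shapes match and checkpoints still load, but attention is then
# computed with a different head layout.
def attention_layout(num_attention_heads, attention_head_dim, seq_len=8):
    inner_dim = num_attention_heads * attention_head_dim
    # queries get reshaped to (heads, seq_len, head_dim) for attention
    return inner_dim, (num_attention_heads, seq_len, attention_head_dim)

correct = attention_layout(num_attention_heads=8, attention_head_dim=64)
swapped = attention_layout(num_attention_heads=64, attention_head_dim=8)

print(correct)  # (512, (8, 8, 64))
print(swapped)  # (512, (64, 8, 8)) -- same inner_dim, different heads
```

So the swap is shape-compatible but not computation-equivalent, which is exactly the kind of bug a checkpoint load will not warn about.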

I want to train my own video model; please give me some advice. How long should each video clip be? How many frames per video? How many videos are...
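A back-of-the-envelope helper for sizing a dataset. The 16-frames-per-sample and 8 fps numbers below are illustrative assumptions for ModelScope-style finetuning, not values taken from this repository's training script:

```python
# Rough dataset-sizing arithmetic: how many non-overlapping training
# samples one clip yields at a given sampling rate and sample length.
def samples_per_clip(clip_seconds, sample_fps=8, frames_per_sample=16):
    total_frames = int(clip_seconds * sample_fps)
    return total_frames // frames_per_sample

# A 10-second clip sampled at 8 fps gives 80 frames -> 5 samples of 16.
print(samples_per_clip(10))  # 5
# A 1-second clip yields 8 frames, too short for even one 16-frame sample.
print(samples_per_clip(1))   # 0
```

The main takeaway: clips shorter than `frames_per_sample / sample_fps` seconds contribute nothing, so trimming videos into chunks a few times that length is a reasonable starting point.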