Open-Sora's results are far worse than I expected. Is something in my setup misconfigured?

I used the t2v_sora sample.

Mar 18 '24 11:03 wings0820

Could you share the CLI command you ran? I suspect the problem is in my environment.

Mar 18 '24 11:03 killnice66

https://github.com/hpcaitech/Open-Sora/issues/117 Have you run into this bug?

Mar 18 '24 11:03 killnice66

As the README.md says:

⚠️ LIMITATION: Our model is trained on a limited budget. The quality and text alignment is relatively poor. The model performs badly especially on generating human beings and cannot follow detailed instructions. We are working on improving the quality and text alignment.

So do not try to generate human beings.

Mar 18 '24 12:03 kyww

How large a GPU did you use for inference?

Mar 18 '24 13:03 hertz-pj

Which GPU did you use for inference? Thanks.

Mar 18 '24 18:03 wenter

#116 Have you run into this problem? How did you resolve the apex installation error? My CUDA version is 12.0.

Mar 19 '24 00:03 openchao

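The bugs you all mentioned are caused by apex, which is a piece of junk. You need to install the apex package: I downloaded the 22.04-dev source, changed the torch._six references, then compiled and installed it, and set up the rest. The flash_attn package is also slow to install. My setup is Ubuntu 22.04, A100 80GB, torch 2.0.1, CUDA 12.2, driver 535. I suspect the authors have better weights that they have not released; with the same prompts, none of my results look like the demo.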

Mar 19 '24 02:03 wings0820

Have you run into this problem? I get the error Missing keys: ['pos_embed', 'pos_embed_temporal']

[03/18/24 05:16:36] INFO colossalai - colossalai - INFO: /home/aiscuser/.conda/envs/opensora/lib/python3.10/site-packages/colossalai/initialize.py:67 launch
INFO colossalai - colossalai - INFO: Distributed environment is initialized, world size: 1
./pretrained_models/t5_ckpts/t5-v1_1-xxl
Loading checkpoint shards: 100%|██████████| 2/2 [00:24<00:00, 12.07s/it]
Missing keys: ['pos_embed', 'pos_embed_temporal']
Unexpected keys: []

Mar 19 '24 04:03 happy-jing

These results feel like a throwback to the Stone Age.

Mar 19 '24 06:03 shineway14

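> The bugs you all mentioned are caused by apex, which is a piece of junk. You need to install the apex package: I downloaded the 22.04-dev source, changed the torch._six references, then compiled and installed it, and set up the rest. The flash_attn package is also slow to install. My setup is Ubuntu 22.04, A100 80GB, torch 2.0.1, CUDA 12.2, driver 535. I suspect the authors have better weights that they have not released; with the same prompts, none of my results look like the demo.

Hello, when installing apex, did you hit a point where it simply would not install? Could we compare notes? QQ: 1319609405.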

Mar 19 '24 08:03 openchao

https://github.com/hpcaitech/Open-Sora/assets/4608210/db9ce689-3277-43bb-9eff-115ee45cad9c

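I generated 10 videos with the project's t2v config, and this is one of the few where you can still tell what it is. Help wanted: how can I improve the results?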

{'num_frames': 64, 'fps': 12, 'image_size': (512, 512), 'model': {'type': 'STDiT-XL/2', 'space_scale': 1.0, 'time_scale': 0.6666666666666666, 'enable_flashattn': True, 'enable_layernorm_kernel': True, 'from_pretrained': './OpenSora-v1-HQ-16x512x512.pth'}, 'vae': {'type': 'VideoAutoencoderKL', 'from_pretrained': './pretrained_models/stabilityai/sd-vae-ft-ema/', 'micro_batch_size': 128}, 'text_encoder': {'type': 't5', 'from_pretrained': './pretrained_models/t5_ckpts/', 'model_max_length': 120}, 'scheduler': {'type': 'iddpm', 'num_sampling_steps': 100, 'cfg_scale': 7.0}, 'dtype': 'fp16', 'batch_size': 1, 'seed': 42, 'prompt_path': './assets/texts/t2v_samples.txt', 'save_dir': './outputs/samples/', 'multi_resolution': False}

Mar 21 '24 06:03 g711ab

> sample_9.mp4 I generated 10 videos with the project's t2v config, and this is one of the few where you can still tell what it is. Help wanted: how can I improve the results?
>
> {'num_frames': 64, 'fps': 12, 'image_size': (512, 512), 'model': {'type': 'STDiT-XL/2', 'space_scale': 1.0, 'time_scale': 0.6666666666666666, 'enable_flashattn': True, 'enable_layernorm_kernel': True, 'from_pretrained': './OpenSora-v1-HQ-16x512x512.pth'}, 'vae': {'type': 'VideoAutoencoderKL', 'from_pretrained': './pretrained_models/stabilityai/sd-vae-ft-ema/', 'micro_batch_size': 128}, 'text_encoder': {'type': 't5', 'from_pretrained': './pretrained_models/t5_ckpts/', 'model_max_length': 120}, 'scheduler': {'type': 'iddpm', 'num_sampling_steps': 100, 'cfg_scale': 7.0}, 'dtype': 'fp16', 'batch_size': 1, 'seed': 42, 'prompt_path': './assets/texts/t2v_samples.txt', 'save_dir': './outputs/samples/', 'multi_resolution': False}

Hello, could you share a few more sample results? Thanks a lot!

Mar 21 '24 07:03 1030zero

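> The bugs you all mentioned are caused by apex, which is a piece of junk. You need to install the apex package: I downloaded the 22.04-dev source, changed the torch._six references, then compiled and installed it, and set up the rest. The flash_attn package is also slow to install. My setup is Ubuntu 22.04, A100 80GB, torch 2.0.1, CUDA 12.2, driver 535. I suspect the authors have better weights that they have not released; with the same prompts, none of my results look like the demo.

Could you tell me what exactly needs to be changed for torch._six?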

Apr 04 '24 09:04 Weixiang-Sun
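
For anyone landing here with the same question: the thread never spells out the torch._six edit. torch._six is a private module that no longer exists in PyTorch 2.x, so older apex sources that import from it fail at import time. A common workaround is to replace those imports with standard-library equivalents. The sketch below is an assumption about the kind of change involved, not the original poster's actual patch, and which apex files need it depends on the apex version.

```python
import math

try:
    # Older PyTorch releases still ship the private torch._six module.
    from torch._six import inf, string_classes
except ImportError:
    # Newer PyTorch has no torch._six; fall back to the values it used to
    # define. This branch also runs when torch is not installed at all.
    inf = math.inf
    string_classes = (str, bytes)

# Code that previously relied on torch._six can keep using the same names.
print(isinstance("prompt", string_classes), inf)
```

In the apex source, this usually means replacing each `from torch._six import ...` line with the fallback assignments and rebuilding.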

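> The bugs you all mentioned are caused by apex, which is a piece of junk. You need to install the apex package: I downloaded the 22.04-dev source, changed the torch._six references, then compiled and installed it, and set up the rest. The flash_attn package is also slow to install. My setup is Ubuntu 22.04, A100 80GB, torch 2.0.1, CUDA 12.2, driver 535. I suspect the authors have better weights that they have not released; with the same prompts, none of my results look like the demo.

The demos we generated use exactly the weights we released. Something in your setup may still be off: our own test results are all normal, and the picture quality should be sharp. If apex fails to install, you can skip installing it and set enable_flash_attn=False in the config.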

May 10 '24 07:05 zhengzangw
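
A minimal sketch of the config change suggested above. The key names are taken from the config dump posted earlier in this thread, which spells the flag `enable_flashattn` (not `enable_flash_attn`) and also has `enable_layernorm_kernel`. That the layernorm flag is the one depending on apex is an assumption here, and `disable_optional_kernels` is a hypothetical helper, not part of Open-Sora.

```python
import copy


def disable_optional_kernels(cfg: dict) -> dict:
    """Return a copy of cfg with the optional fused-kernel flags switched off."""
    new_cfg = copy.deepcopy(cfg)
    model = new_cfg.setdefault("model", {})
    model["enable_flashattn"] = False         # assumed to gate flash_attn
    model["enable_layernorm_kernel"] = False  # assumed to gate apex's fused layernorm
    return new_cfg


# Trimmed-down version of the config posted in this thread.
cfg = {
    "model": {
        "type": "STDiT-XL/2",
        "enable_flashattn": True,
        "enable_layernorm_kernel": True,
    }
}
patched = disable_optional_kernels(cfg)
print(patched["model"])
```

In practice the same two values would be edited directly in the sample config file rather than patched at runtime.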

> sample_9.mp4 I generated 10 videos with the project's t2v config, and this is one of the few where you can still tell what it is. Help wanted: how can I improve the results?
>
> {'num_frames': 64, 'fps': 12, 'image_size': (512, 512), 'model': {'type': 'STDiT-XL/2', 'space_scale': 1.0, 'time_scale': 0.6666666666666666, 'enable_flashattn': True, 'enable_layernorm_kernel': True, 'from_pretrained': './OpenSora-v1-HQ-16x512x512.pth'}, 'vae': {'type': 'VideoAutoencoderKL', 'from_pretrained': './pretrained_models/stabilityai/sd-vae-ft-ema/', 'micro_batch_size': 128}, 'text_encoder': {'type': 't5', 'from_pretrained': './pretrained_models/t5_ckpts/', 'model_max_length': 120}, 'scheduler': {'type': 'iddpm', 'num_sampling_steps': 100, 'cfg_scale': 7.0}, 'dtype': 'fp16', 'batch_size': 1, 'seed': 42, 'prompt_path': './assets/texts/t2v_samples.txt', 'save_dir': './outputs/samples/', 'multi_resolution': False}

The weights you are using should only support num_frames=16.

May 10 '24 07:05 zhengzangw
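
The mismatch above (num_frames 64 with a checkpoint named 16x512x512) can be caught before sampling. The sketch below assumes the checkpoint filename encodes frames x height x width, which is a reading of the name `OpenSora-v1-HQ-16x512x512.pth` and not something the thread confirms; `shape_from_checkpoint_name` is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import Optional, Tuple


def shape_from_checkpoint_name(path: str) -> Optional[Tuple[int, int, int]]:
    """Parse (frames, height, width) from a name like '...-16x512x512.pth'."""
    match = re.search(r"(\d+)x(\d+)x(\d+)", Path(path).name)
    if match is None:
        return None
    frames, height, width = (int(g) for g in match.groups())
    return frames, height, width


# Relevant fields from the config posted in this thread.
cfg = {
    "num_frames": 64,
    "image_size": (512, 512),
    "model": {"from_pretrained": "./OpenSora-v1-HQ-16x512x512.pth"},
}

shape = shape_from_checkpoint_name(cfg["model"]["from_pretrained"])
if shape is not None and cfg["num_frames"] != shape[0]:
    print(f"num_frames={cfg['num_frames']} does not match checkpoint ({shape[0]}); overriding")
    cfg["num_frames"] = shape[0]
```

With the posted config, this sets num_frames to 16, matching the maintainer's advice.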