len(dataloader) is 0

Open ZXMMD opened this issue 1 year ago • 8 comments

Training command: CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node 4 scripts/train.py configs/opensora-v1-2/train/stage1.py --data-path /data/video_meta/meta_clips_info_fmin1_aes_aesmin5.0_ocr.csv --layernorm-kernel False

This is the csv file: image

This is the bucket_config: image

This is the code: image. But why is the length of the dataloader 0? image

ZXMMD avatar Jul 23 '24 06:07 ZXMMD

I have run into the same problem.

281LinChenjian avatar Jul 24 '24 03:07 281LinChenjian

I have run into the same problem.

KihongK avatar Jul 24 '24 08:07 KihongK

I have solved this problem. You can change bucket_config to use a smaller batch_size, and then it works.

281LinChenjian avatar Jul 24 '24 08:07 281LinChenjian

Thank you for the reply, but I don't understand how to change bucket_config to a smaller batch_size. Could you explain it in more detail?

KihongK avatar Jul 24 '24 08:07 KihongK

You can find it in /Open-Sora/configs/opensora-v1-2/train/stage1.py:

bucket_config = {  # 12s/it
    "144p": {1: (1.0, 475), 51: (1.0, 2), 102: ((1.0, 0.33), 2), 204: ((1.0, 0.1), 13), 408: ((1.0, 0.1), 6)},
    # ---
    "256": {1: (0.4, 297), 51: (0.5, 2), 102: ((0.5, 0.33), 2), 204: ((0.5, 0.1), 5), 408: ((0.5, 0.1), 2)},
    "240p": {1: (0.3, 1), 51: (0.4, 2), 102: ((0.4, 0.33), 2), 204: ((0.4, 0.1), 1), 408: ((0.4, 0.1), 2)},
    # ---
    "360p": {1: (0.2, 1), 51: (0.15, 2), 102: ((0.15, 0.33), 2), 204: ((0.15, 0.1), 1), 408: ((0.15, 0.1), 1)},
    "512": {1: (0.1, 141)},
    # ---
    "480p": {1: (0.1, 89)},
    # ---
    "720p": {1: (0.05, 36)},
    "1024": {1: (0.05, 36)},
    # ---
    "1080p": {1: (0.1, 5)},
    # ---
    "2048": {1: (0.1, 5)},
}

For example, in 51: (0.4, 2), the 2 is the batch_size. You can change the original batch_size to a smaller number.
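If editing every entry by hand is tedious, a small helper can cap all the batch sizes at once. This is only a sketch, not part of Open-Sora: `cap_batch_sizes` is a hypothetical name, and it assumes every entry has the `(probability, batch_size)` shape shown in the config above, with an integer batch size.

```python
def cap_batch_sizes(bucket_config, max_batch_size):
    """Return a copy of bucket_config with each batch size capped.

    Assumes each entry looks like {num_frames: (probability, batch_size)},
    where batch_size is an int. The probability part is left unchanged.
    """
    return {
        resolution: {
            num_frames: (prob, min(batch_size, max_batch_size))
            for num_frames, (prob, batch_size) in frames.items()
        }
        for resolution, frames in bucket_config.items()
    }


# A few entries copied from the config above, used only for illustration.
example_config = {
    "144p": {1: (1.0, 475), 51: (1.0, 2), 102: ((1.0, 0.33), 2)},
    "512": {1: (0.1, 141)},
}

capped = cap_batch_sizes(example_config, max_batch_size=4)
print(capped)
```

Entries that are already at or below the cap, such as `51: (1.0, 2)`, stay as they are.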

281LinChenjian avatar Jul 24 '24 08:07 281LinChenjian

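After reading the answers in other issues, I found that Open-Sora's batch_size changes dynamically. 51: (0.4, 2) means that a batch_size of 2 is used for clips with 51 frames. This value was originally 8, and after I changed it to 2, training ran normally. The likely reason is that our data could not fill even one batch at the original batch_size, so the dataloader ended up with no batches at all. The rest is a matter of experimenting yourself; I have only just started with this too.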
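The arithmetic behind that guess can be sketched as follows. This is not Open-Sora's sampler code; it only assumes that a bucket drops any batch it cannot fill completely. `full_batches` and the clip count of 5 are made up for illustration.

```python
def full_batches(num_samples: int, batch_size: int) -> int:
    """Number of complete batches when incomplete batches are dropped."""
    return num_samples // batch_size


# Hypothetical bucket that holds only 5 clips with 51 frames.
clips_in_bucket = 5

batches_before = full_batches(clips_in_bucket, 8)  # original batch_size of 8
batches_after = full_batches(clips_in_bucket, 2)   # reduced batch_size of 2

print(batches_before, batches_after)  # prints "0 2"
```

Under that assumption, a bucket with fewer samples than its batch_size contributes nothing, and if this holds for every bucket the dataset falls into, len(dataloader) comes out as 0.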

281LinChenjian avatar Jul 24 '24 08:07 281LinChenjian

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Aug 01 '24 01:08 github-actions[bot]

Thanks for your helpful answers!

JThh avatar Aug 07 '24 04:08 JThh

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Aug 23 '24 01:08 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Aug 30 '24 01:08 github-actions[bot]