SystemErrorWang
@calmevtime It seems the function `block_composition` is required by UpsampleDCT. Would you please upload the dct_resize.py file, and also clarify the meaning of L, M, and N?
Could anyone find "dct_resize.py"? It is imported everywhere but can't be found.
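> > > > Can this be trained on a V100 32G? > > > > > > > > > DeepSpeed can bring stage 1 down to about 33 GB on two cards. With more cards it can drop a bit further. You could consider using ColossalAI to offload most of the parameters and optimizer states to the CPU, but I have not gotten that to work. > > > > > > Hello. Could you share the DeepSpeed version of the code that brings stage 1 down to 33 GB on two cards? Thank you very much. > > You can refer to the Moore Threads (摩尔线程) training code and launch DeepSpeed through accelerate. Hello, I am using the Moore Threads training code, and DeepSpeed training hangs after 4 steps. Did you succeed with their code plus DeepSpeed? Could you describe how you used it?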
> > You can use DeepSpeed to reduce GPU memory. > > I did, but the process gets stuck at backward in the 5th step. I was confused. Same problem, training with...
I succeeded in testing on a single image with the provided pre-trained model, but found severe checkerboard artifacts. This may be caused by the image preprocessing.
Same here, my training got stuck after 4 steps and showed this message: `[loss_scaler.py:183:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 536870912, reducing to 268435456`
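For anyone reading that log line: with fp16 dynamic loss scaling, an overflow usually means the step is skipped and the loss scale is halved, which matches the 536870912 to 268435456 change in the message. The snippet below is only a toy sketch of that idea, not DeepSpeed's actual `loss_scaler.py`; the class name `ToyLossScaler` and its parameters (`factor`, `window`, `min_scale`) are made up for illustration. It suggests the message by itself is expected behavior and may not be the cause of the hang.

```python
class ToyLossScaler:
    """Simplified dynamic loss scaler (hypothetical, for illustration only)."""

    def __init__(self, init_scale=2.0 ** 29, factor=2.0, window=1000, min_scale=1.0):
        self.scale = init_scale      # current loss scale
        self.factor = factor         # shrink/grow factor
        self.window = window         # clean steps required before growing
        self.min_scale = min_scale   # never shrink below this
        self.good_steps = 0          # consecutive steps without overflow

    def update(self, overflow: bool) -> bool:
        """Update the scale; return True if the optimizer step should be applied."""
        if overflow:
            # inf/NaN gradients: skip this step and shrink the scale
            self.scale = max(self.scale / self.factor, self.min_scale)
            self.good_steps = 0
            return False
        self.good_steps += 1
        if self.good_steps % self.window == 0:
            # a long run of clean steps: try a larger scale again
            self.scale *= self.factor
        return True


scaler = ToyLossScaler()
applied = scaler.update(overflow=True)
print(int(scaler.scale), applied)
```

If the scale keeps dropping on every step, or the job stops printing anything at all, the problem is probably elsewhere (for example in the backward pass or in communication between ranks).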
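> We could release our CSV, but we did not record the timestamps of the video cuts during processing. Even if we released it, people probably could not use it, so we do not plan to release it for now. I would also like to ask for the pixels video links; even without the corresponding timestamps, we can download and clean the data ourselves.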
Sorry for the late reply. I also ran into this problem in my training. I suggest you try deep guided filter, as it only uses TensorFlow default ops and no additional...
Same problem here. I would be really grateful if anyone has a solution.
> @thbupt You can try: Follow the steps: > > conda install -c pytorch pytorch=1.9.1 torchvision cudatoolkit=10.2 conda install -c fvcore -c iopath -c conda-forge fvcore iopath conda install -c...