Ziyan Chen comments

Results: 18 comments by Ziyan Chen

About training steps and correctness.

Same here. Anyone got solutions?

Running multi-gpu training

The code uses `accelerate` to handle DDP automatically.

Running multi-gpu hangs after first step

I ran into the same problem here. Multi-GPU training gets stuck at step 1, while single-GPU training works fine. I did some debugging. The first step always works fine until the second...

Running multi-gpu hangs after first step

I've done some debugging. I believe several things could cause this hang, such as my Linux kernel being too old to support the latest versions of torch and accelerate,...

Multi GPU Support?

Hey, we've added tiled inference to avoid OOM. You can try this new feature to solve the problem.

Model downloading problems: HuggingFace connection error

This seems to be a HuggingFace connection error. The script downloads pre-trained models from HuggingFace automatically. Try visiting HuggingFace and see if you can access it. If you are working on a...
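A quick way to run the "can you access it" check from the same machine that runs the script is sketched below. This is a minimal sketch using only the Python standard library, not part of the repository; the function name `can_reach` and its parameters are hypothetical, and the URL is just the HuggingFace front page.

```python
import urllib.error
import urllib.request


def can_reach(url="https://huggingface.co", timeout=5.0, opener=urllib.request.urlopen):
    """Return True if `url` answers at all, False on a network-level failure.

    `opener` is injectable so the check can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, even if with an error status, so it is reachable.
        return True
    except (urllib.error.URLError, TimeoutError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return False
```

Calling `can_reach()` on the training server and getting `False` would suggest the problem is the server's network access rather than the script itself.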

Model downloading problems: HuggingFace connection error

Hey guys, all the related models should be downloaded automatically if you can access the HuggingFace website. However, if you are stuck downloading models from HuggingFace because of the server's internet access problem...

Model downloading problems: HuggingFace connection error

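> Hi, first of all, thank you for your answer. I followed your approach to solve this problem, but it still reports the error RuntimeError: Pretrained weights ($/home/hjt/下载/wy/DiffBIR-main/.cache/huggingface/hub/open_clip_pytorch_model.bin) not found for model ViT-H-14. Available pretrained tags (['laion2b_s32b_b79k']. Could you help with this? Thanks.

It looks like the CLIP model file was not downloaded successfully. You can try downloading it again.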

Model downloading problems: HuggingFace connection error

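> > > Hi, first of all, thank you for your answer. I followed your approach to solve this problem, but it still reports the error RuntimeError: Pretrained weights ($/home/hjt/下载/wy/DiffBIR-main/.cache/huggingface/hub/open_clip_pytorch_model.bin) not found for model ViT-H-14. Available pretrained tags (['laion2b_s32b_b79k']. Could you help with this? Thanks.
>
> > It looks like the CLIP model file was not downloaded successfully. You can try downloading it again.
>
> Thanks for the reply, but I tried downloading this file again and still get the same error. I downloaded it from a mirror site. Could that make a difference?

It shouldn't make a difference. However, your path "$/home/hjt/下载/" looks a bit off. You may want to check it.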
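The kind of path check suggested here can be sketched as a small helper. This is only an illustration under assumptions: `describe_weights_path` is a hypothetical name, not a function from the repository or from open_clip, and it only inspects the path string and the local filesystem.

```python
import os


def describe_weights_path(path):
    """Return a list of human-readable problems found with a checkpoint path."""
    problems = []
    if path.startswith("$"):
        # A leading '$' usually means a stray character or an unexpanded variable.
        problems.append("path starts with '$'")
    expanded = os.path.expanduser(os.path.expandvars(path))
    if not os.path.isabs(expanded):
        problems.append("path is not absolute")
    if not os.path.isfile(expanded):
        problems.append("no file exists at this path")
    return problems
```

An empty list means none of these three checks found a problem. It does not prove the file is a valid checkpoint.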

SR scale x1 with OOM: CUDA out of memory

Hey, try the tiled options for inference. They can help you stay within the memory limit.
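For anyone wondering why tiling helps: the model only ever sees one tile at a time, so peak memory depends on the tile size rather than the full image size. The sketch below shows just the tile-layout part of that idea. It is an illustration under assumptions, not the repository's actual implementation; `tile_boxes`, `tile_size`, and `stride` are hypothetical names.

```python
def tile_boxes(height, width, tile_size, stride):
    """Yield (top, left, bottom, right) boxes that together cover the image."""
    if tile_size <= 0 or not 0 < stride <= tile_size:
        raise ValueError("need tile_size > 0 and 0 < stride <= tile_size")

    def starts(length):
        # One tile is enough if the image is smaller than a tile on this axis.
        if length <= tile_size:
            return [0]
        positions = list(range(0, length - tile_size, stride))
        positions.append(length - tile_size)  # last tile sits flush with the edge
        return positions

    for top in starts(height):
        for left in starts(width):
            yield top, left, min(top + tile_size, height), min(left + tile_size, width)
```

With `stride` smaller than `tile_size`, neighbouring tiles overlap, and the overlapping outputs are typically blended to hide seams. That blending step is left out here.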