harry huang

2 issues by harry huang

#2359 There is no retrying in `weed filer.copy` command now.

### 🐛 Describe the bug

code at [github](https://github.com/banjiaojuhao/fourcastnet-colossalai)

# expected

run as expected with GPU0,1 (connected with nvlink):

```
$ CUDA_VISIBLE_DEVICES=0,1 colossalai run --nproc_per_node 2 colossal.py
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment...
```

bug