BiSeNet
RuntimeError: copy_if failed to synchronize: an illegal memory access was encountered
Hi, thanks for your great work. I'm trying to train the model on my own data using a single GPU, and I have already made the modifications described in your README. However, I hit the issue below as soon as training starts. Could you help me check it? Thanks again.
Here is the log (pasted as text instead of a screenshot):
bc311@bc311-ai1:/work/xxx/BiSeNet$ python3 tools/train.py --model bisenetv2
loss_pre= tensor(1.2688, device='cuda:0', grad_fn=<MeanBackward0>)
loss_aux= [tensor(9.9736, device='cuda:0', grad_fn=<MeanBackward0>), tensor(3.0100, device='cuda:0', grad_fn=<MeanBackward0>), tensor(5.4277, device='cuda:0', grad_fn=<MeanBackward0>), tensor(3.6156, device='cuda:0', grad_fn=<MeanBackward0>)]
sum= tensor(22.0268, device='cuda:0', grad_fn=<AddBackward0>)
loss= tensor(23.2956, device='cuda:0', grad_fn=<AddBackward0>)
Traceback (most recent call last):
File "tools/train.py", line 240, in
I had the same problem after the first epoch. I found that I had not changed the val path in the config file. Make sure your Bisenetv1.py file has been changed correctly, for example: im_root='/home/edge/fjj_workspace/data/img', train_im_anns='/home/edge/fjj_workspace/data/trainJK.txt', val_im_anns='/home/edge/fjj_workspace/data/valJK.txt',
Thanks for the information. It seems my case is different from yours: I hit the error on the first batch, right when training starts. I also checked the training image paths and the image reading, and found no problem there.
Hi,
Are you using your own dataset or the Cityscapes dataset?
It's my own dataset
Hi, sorry for the late reply. I will look into your issue now. I fine-tune on my own dataset.
How many categories are there in your dataset? Are you using the dataset class designed for Cityscapes, or did you implement a new dataset class?
Please note that the Cityscapes training labels are mapped from the pixel values of the label images according to the Cityscapes specification. See: https://github.com/CoinCheung/BiSeNet/blob/aa3876b4b1f2c430e07678f8c15b96465681fca0/lib/base_dataset.py#L44
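As a rough illustration of that kind of mapping (a sketch only, not the repository's actual code), raw pixel values in a label image can be remapped to contiguous training ids with a 256-entry lookup table. The function names, the example mapping, and the ignore value of 255 are all assumptions made for this example.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed "ignore" value for pixels with no training class


def build_label_lut(raw_to_train, ignore_index=IGNORE_INDEX):
    """Build a 256-entry table mapping raw pixel values to training ids.

    Raw values that are not listed in raw_to_train map to ignore_index.
    """
    lut = np.full(256, ignore_index, dtype=np.uint8)
    for raw_value, train_id in raw_to_train.items():
        lut[raw_value] = train_id
    return lut


def remap_label(label, lut):
    """Apply the lookup table to a uint8 label image via integer indexing."""
    return lut[label]


# Hypothetical mapping for a 3-class dataset (background plus two classes).
lut = build_label_lut({0: 0, 128: 1, 200: 2})
raw = np.array([[0, 128], [200, 37]], dtype=np.uint8)
mapped = remap_label(raw, lut)  # the unlisted value 37 becomes IGNORE_INDEX
```

If a custom dataset already stores labels as 0, 1, 2, ..., no remapping of this kind should be needed; the point is only that the values reaching the loss must be training ids, not arbitrary pixel values.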
There are 3 classes in my dataset, including background, and it is in Pascal VOC format. How do I set the number of classes in the config file? Also, I checked and self.lb_map is not None. How should I change it for my dataset?
Thanks.
Hello @CoinCheung, I ran into this error. Do you have any ideas?
Traceback (most recent call last):
File "D:/GitHub/BiSeNet/tools/train_amp.py", line 219, in
@miscedence12 Did you check your dataset? What is your label range?
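One way to check the label range, sketched here under assumptions: a common cause of this kind of CUDA error is a label value outside [0, n_classes) reaching the loss on the GPU. The helper below is hypothetical (it is not part of the repository) and assumes that 255 is the ignore value. It lists the offending values in a label array that has already been loaded as a NumPy array.

```python
import numpy as np


def find_invalid_label_values(label, n_classes, ignore_index=255):
    """Return the sorted label values that are neither a valid class id
    in [0, n_classes) nor the (assumed) ignore_index."""
    values = np.unique(label)
    return [
        int(v)
        for v in values
        if not (0 <= v < n_classes or v == ignore_index)
    ]


# Example with a 3-class dataset: the value 3 is out of range.
bad_label = np.array([[0, 1], [2, 3]], dtype=np.uint8)
good_label = np.array([[0, 1], [2, 255]], dtype=np.uint8)
bad_values = find_invalid_label_values(bad_label, n_classes=3)
good_values = find_invalid_label_values(good_label, n_classes=3)
```

Running such a check over every label file before training, on the CPU, tends to give a much clearer message than the asynchronous CUDA failure.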
I am closing this, since the problem is likely to have been solved.