Unable to train the network

Open AmirAliEbrahimi opened this issue 4 years ago • 8 comments

Hi, thank you for your awesome project. I downloaded the dataset and tried to train on it myself using the train script, but I encountered this error:

File "/export/tmp/ebrahimi/rail_marking/scripts/segmentation/./../../rail_marking/segmentation/models/ohem_ce_loss.py", line 29, in forward
    loss_hard = loss[loss > self.thresh.to(device)]
RuntimeError: CUDA error: an illegal memory access was encountered

Also, for the dataset, I merged all of the JPEGs, PNGs, and JSONs into one folder and set it as the --data_path argument of the script. Is that OK?

AmirAliEbrahimi avatar Jan 22 '21 11:01 AmirAliEbrahimi

@AmirAliEbrahimi Can you please clarify how many label classes you are training? For this repo, I already modified the original dataset into a new one with only 3 classes.

If you train with a different number of classes, you need to create a new cfg file in the cfg directory and change the num_classes value.
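A mismatch between num_classes and the values stored in the label maps is one plausible cause of CUDA errors inside a cross-entropy-style loss, because the loss indexes class scores by label value. The following is a minimal sketch for checking that, not code from this repository. The function name `out_of_range_labels` is hypothetical, and the `ignore_index=255` default is an assumption about the "don't care" value in your masks.

```python
from collections import Counter
from typing import Dict, Iterable


def out_of_range_labels(
    pixels: Iterable[int], num_classes: int, ignore_index: int = 255
) -> Dict[int, int]:
    """Return {label value: pixel count} for labels outside the valid range.

    Valid labels are 0 .. num_classes - 1, plus ignore_index (an assumed
    "don't care" value; pass -1 if your data has none).
    """
    counts = Counter(int(p) for p in pixels)
    return {
        value: n
        for value, n in sorted(counts.items())
        if value != ignore_index and not 0 <= value < num_classes
    }


# Toy flattened mask. With a real mask loaded as a NumPy array, you could
# pass mask.ravel().tolist() instead.
toy_mask = [0, 1, 2, 2, 7, 18, 255]
bad = out_of_range_labels(toy_mask, num_classes=3)
print(bad)
```

If the result is non-empty for any mask, either the cfg file's num_classes is too small for the data or the masks need remapping before training.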

xmba15 avatar Jan 23 '21 03:01 xmba15

Here is the logic for the data loader. In my dataset, the images are in jpg format and the ground truths are in png format only, so I differentiate them by file format. https://github.com/xmba15/rail_marking/blob/master/rail_marking/segmentation/data_loader/railsem_mask_dataset.py#L51-L55

If your dataset is organized differently, you need to modify the data loader accordingly.
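As a rough illustration of extension-based pairing, here is a minimal sketch that matches each jpg image with the png mask sharing its file stem. It is not the repository's loader; `pair_images_and_masks` is a hypothetical helper, and it assumes images and masks sit in the same folder under the same base name.

```python
from pathlib import Path
from typing import List, Tuple


def pair_images_and_masks(data_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair each .jpg image with the .png mask that shares its file stem."""
    # Index masks by stem so each image lookup is a dictionary access.
    masks = {p.stem: p for p in data_dir.glob("*.png")}
    pairs = []
    for image in sorted(data_dir.glob("*.jpg")):
        mask = masks.get(image.stem)
        if mask is None:
            # Fail early instead of silently training on a partial dataset.
            raise FileNotFoundError(f"no .png mask found for {image.name}")
        pairs.append((image, mask))
    return pairs
```

Files with other extensions, such as json annotations placed in the same folder, are simply ignored by this sketch. Images saved with a .jpeg extension would not be matched by the "*.jpg" pattern.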

xmba15 avatar Jan 23 '21 03:01 xmba15

Thanks for the reply. Currently I am using the original RailSem19 and trying to train on all of its classes, so I will try a new cfg file. For the ground truths, should I use the 8UC1 label map images provided by the dataset, or the images annotated by the JSON files?

AmirAliEbrahimi avatar Jan 29 '21 06:01 AmirAliEbrahimi

@AmirAliEbrahimi Sorry for the late reply. The ground truth should be the 8UC1 label map. Please try it; if you still have problems with the training, maybe I will add scripts to train on the original (unmodified) dataset.

xmba15 avatar Mar 05 '21 04:03 xmba15

@xmba15 Thank you for your response, I would appreciate it if you could add the scripts to train the original dataset.

AmirAliEbrahimi avatar Mar 12 '21 16:03 AmirAliEbrahimi

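Thanks for the reply. Currently I am using the original RailSem19 and trying to train on all of its classes, so I will try a new cfg file. For the ground truths, should I use the 8UC1 label map images provided by the dataset, or the images annotated by the JSON files?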

My dear friend, I am so sorry to disturb you, but I am curious whether you have finished all the training. I would be honored if I could learn from your work.

lmcggg avatar Apr 22 '23 15:04 lmcggg

@AmirAliEbrahimi Can you please clarify how many label classes you are training? For this repo, I already modified the original dataset into a new one with only 3 classes.

If you train with a different number of classes, you need to create a new cfg file in the cfg directory and change the num_classes value.

Hi, I am curious how to modify the original dataset into the new one with 3 classes. I would be very honored if you replied.

Zyjhubei avatar Sep 20 '23 04:09 Zyjhubei

@lmcggg @Zyjhubei

Hi, unfortunately I didn't modify the dataset or the code at that time, and I don't work on this project anymore. Sorry about that.

AmirAliEbrahimi avatar Sep 29 '23 18:09 AmirAliEbrahimi