FCHarDNet
About IoU with an input size different from the Cityscapes default, please help me!
Because my camera output size is 640*360, I changed the size of the Cityscapes dataset accordingly. Then I trained with the project code, but I can't reach a good IoU of about 75%. Please help me: how can I get a good IoU result? The train log is:

INFO:ptsemseg:Iter [90000/90000] Loss: 0.9804 Time/Image: 0.0165 lr=0.090953
11it [00:05, 2.08it/s]
INFO:ptsemseg:Iter 90000 Val Loss: 1.1422
INFO:ptsemseg:Overall Acc: : 0.902127139034985
INFO:ptsemseg:Mean Acc : : 0.6061411175875274
INFO:ptsemseg:FreqW Acc : : 0.8350029456995808
INFO:ptsemseg:Mean IoU : : 0.48783643289422235
INFO:ptsemseg:0: 0.949375256023792
INFO:ptsemseg:1: 0.6434965931893878
INFO:ptsemseg:2: 0.8355899486279862
INFO:ptsemseg:3: 0.36742551411680824
INFO:ptsemseg:4: 0.24889070738206942
INFO:ptsemseg:5: 0.31642120260993717
INFO:ptsemseg:6: 0.28094440896774303
INFO:ptsemseg:7: 0.42495380920802917
INFO:ptsemseg:8: 0.8470327817169933
INFO:ptsemseg:9: 0.4534516073428247
INFO:ptsemseg:10: 0.8743228190634068
INFO:ptsemseg:11: 0.4587472314322872
INFO:ptsemseg:12: 0.310287655352762
INFO:ptsemseg:13: 0.8298468752303512
INFO:ptsemseg:14: 0.42321140100466925
INFO:ptsemseg:15: 0.39785631228298224
INFO:ptsemseg:16: 0.031675380114154154
INFO:ptsemseg:17: 0.09916433094570747
INFO:ptsemseg:18: 0.4761983903783339
and the hardnet.yml is:
model:
    arch: hardnet
data:
    dataset: cityscapes
    train_split: train
    val_split: val
    img_rows: 360
    img_cols: 640
    path: ../cityscape_transformation/
    sbd_path: ../cityscape_transformation/
training:
    train_iters: 90000
    batch_size: 48
    val_interval: 500
    n_workers: 8
    print_interval: 10
    augmentations:
        hflip: 0.5
        rscale_crop: [360, 360]
    optimizer:
        name: 'sgd'
        lr: 0.1
        weight_decay: 0.0005
        momentum: 0.9
    loss:
        name: 'bootstrapped_cross_entropy'
        min_K: 4096
        loss_th: 0.3
        size_average: True
    lr_schedule:
        name: 'poly_lr'
        max_iter: 9000000
    resume: None
    finetune: None
and I only modified train.py here:
v_loader = data_loader(
    data_path,
    is_transform=True,
    split=cfg["data"]["val_split"],
    img_size=(360, 640),
)
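(A small optional tidy-up, sketched under the assumption that cfg is the parsed hardnet.yml dict used elsewhere in train.py: read the size from the config instead of hardcoding it, so img_rows/img_cols in the YAML stay the single source of truth.)

```python
# Sketch: derive the validation image size from the parsed hardnet.yml config
# (cfg["data"]["img_rows"] / cfg["data"]["img_cols"] are 360 / 640 above),
# instead of hardcoding (360, 640) at the call site.
v_loader = data_loader(
    data_path,
    is_transform=True,
    split=cfg["data"]["val_split"],
    img_size=(cfg["data"]["img_rows"], cfg["data"]["img_cols"]),
)
```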
and the code in cityscapes_loader.py here:
def __init__(
    self,
    root,
    split="train",
    is_transform=False,
    img_size=(360, 640),
    augmentations=None,
    img_norm=True,
    version="cityscapes",
    test_mode=False,
):
    """__init__
Hi, honestly, you can't get a good IoU at such a small input resolution, since this network architecture was designed for 2048x1024 input. Notice that two of the first four conv layers use stride=2, which means the spatial resolution is quickly shrunk to 512x256; in your case it becomes 160x90, which is far too small for segmentation. You can try simply removing the stride=2 from one or both of those conv layers (lines #270, #272). The pretrained weights can still be loaded, but network inference will be much slower. See the sketch below.
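As a rough, self-contained illustration of why those two strides matter (a sketch only: the 16/24/32/48 channel widths and layer layout are assumptions based on the description above, not the actual hardnet.py stem), compare the stem output sizes for a 640x360 input:

```python
import torch
import torch.nn as nn

# Stand-in for the first four conv layers of the FCHarDNet stem (a sketch,
# not the real hardnet.py code): two of the four 3x3 convs use stride=2,
# dividing the spatial resolution by 4 before the HarDNet blocks.
def make_stem(stride_a: int = 2, stride_b: int = 2) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, stride=stride_a, padding=1, bias=False),  # ~line #270
        nn.Conv2d(16, 24, 3, stride=1, padding=1, bias=False),
        nn.Conv2d(24, 32, 3, stride=stride_b, padding=1, bias=False),  # ~line #272
        nn.Conv2d(32, 48, 3, stride=1, padding=1, bias=False),
    )

x = torch.randn(1, 3, 360, 640)   # camera-sized input (H=360, W=640)
print(make_stem(2, 2)(x).shape)   # torch.Size([1, 48, 90, 160])  -> too coarse
print(make_stem(2, 1)(x).shape)   # torch.Size([1, 48, 180, 320]) -> one stride removed
print(make_stem(1, 1)(x).shape)   # torch.Size([1, 48, 360, 640]) -> both removed, slowest
```

Since only the strides change (not kernel shapes or channel counts), the pretrained weights still load; the trade-off is inference speed versus preserved spatial detail.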
Thank you for your help, I will try it!
@electronicYH have you solved the problem with the small input image size?