HDMapNet

bug and question

Open FlowEternal opened this issue 3 years ago • 3 comments

Thanks for providing the label generation and evaluation scripts. 1. Small bug in your devkit: in `data/rasterize.py`, line 22 has the definition `def mask_for_lines(lines, mask, thickness, idx)`,

while on lines 53 and 55 the last two arguments are swapped at the call site: `mask_for_lines(new_single_line, map_mask, idx, thickness)`.

This is not correct, and it influences the data flow of the metric evaluation enormously.
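To make the swap concrete, here is a tiny stand-in (not the real `rasterize.py` code, which draws polylines onto a mask); it only records which value lands in which parameter, exposing the argument mix-up:

```python
# Hypothetical stand-in for mask_for_lines with the same parameter order
# as the definition on line 22; it records what each parameter receives.
def mask_for_lines(lines, mask, thickness, idx):
    return {"thickness": thickness, "idx": idx}

idx, thickness = 7, 1

# Call as written on lines 53/55 (last two arguments swapped):
buggy = mask_for_lines(None, None, idx, thickness)
print(buggy)  # the instance index is used as the line thickness

# Call matching the definition on line 22:
fixed = mask_for_lines(None, None, thickness, idx)
print(fixed)
```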

2. Some label-generation details: the visualization of the labels generated by your original script is shown in the first attached image; the visualization after my modification is shown in the second, which is actually more reasonable.

3. Questions about the paper: (1) I read your paper thoroughly and reimplemented your decoder in my architecture with some changes, keeping your direction-prediction part. One point confuses me: the paper says you use softmax as the activation function for classification, and the ground-truth label is all zeros except for the two opposite directions, which are 1. What loss function do you use for direction prediction? Ordinary cross entropy, even though there are two 1s? Or binary cross entropy, as is usual for multi-label prediction? If the latter, why not use sigmoid as the activation function?

(2) It can be seen from your script that the BEV map range is [-30, 30, 0.15] along x and [-15, 15, 0.15] along y. Is this also the default setting for Table 2 in your paper? The paper does not make this clear. I am also wondering about the influence of the x-y range and the resolution (0.15 in your script) on the metric.
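For reference, those ranges imply the BEV grid size below (simple arithmetic; the [min, max, step] convention follows the script):

```python
# BEV ranges from the script: x in [-30, 30] m, y in [-15, 15] m, 0.15 m per cell
x_min, x_max = -30.0, 30.0
y_min, y_max = -15.0, 15.0
res = 0.15

nx = round((x_max - x_min) / res)  # cells along x
ny = round((y_max - y_min) / res)  # cells along y
print(nx, ny)  # 400 200
```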

FlowEternal avatar Nov 12 '21 13:11 FlowEternal

  1. Thanks for pointing this out! We have fixed it on the main branch.
  2. Your modification looks great! Is this only a visualization change or did you also modify the labels? Also, feel free to create a pull request for this modification.
  3. (1) We use BCE loss, and one adaptation is that we also "softmax" the label: if there are two 1s, we normalize them to 0.5 each. But I guess it would be fine to directly use sigmoid as the activation function. (2) This configuration is used throughout the whole paper. For the second question, a larger perception range should result in worse numerical results, because it is harder for sensors to perceive roads at a long distance. As for the resolution, a higher resolution results in heavier computation costs. Also, the resolution should not be too low, because a road divider won't have a width of, say, 0.3 m.
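If I read the reply correctly, the loss could be sketched as below. This is a pure-Python illustration with a hypothetical 36-bin direction head; the bin count and indices are made up, and `bce`/`softmax` are plain reimplementations, not the repo's code:

```python
import math

num_bins = 36                    # hypothetical number of direction bins
label = [0.0] * num_bins
label[3] = 1.0                   # one direction...
label[3 + num_bins // 2] = 1.0   # ...and its opposite

# "Softmax the label": normalize so the two 1s become 0.5 each
s = sum(label)
label = [v / s for v in label]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def bce(preds, targets, eps=1e-7):
    # mean binary cross entropy over the bins
    return -sum(t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
                for p, t in zip(preds, targets)) / len(preds)

logits = [0.0] * num_bins        # stand-in for the network output
pred = softmax(logits)           # softmax activation, as in the paper
loss = bce(pred, label)
```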

liqi0126 avatar Jan 11 '22 04:01 liqi0126

Hi @liqi17thu, I find the two adaptations you mentioned in 3.(1) very interesting to think about.

  1. Using softmax instead of sigmoid as the activation for prediction: I understand this as you still wanting to encourage the network to predict only one class (with high likelihood), rather than the typical multi-label classification setting where the prediction for each class is treated as an independent variable. Is that correct?
  2. Normalising the labels: I assume you are doing standard normalisation rather than softmax, so this won't change the BCE loss for the classes with label 0, but it will change the BCE loss of the two classes with label 1 from the NLL (red curve below) to the blue curve below. This means the averaged loss is lowest when the network predicts probability 0.5 for both classes with label 1, but relatively high when it predicts probability 1 for one of the two correct classes. That seems opposite to the motivation of point 1, so I'd love to hear more about your thoughts on the design of this loss, thanks!
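A quick numeric check of that claim (just the per-class BCE term with a made-up grid of probabilities, nothing from the repo): with a normalized target of 0.5, the loss is indeed lowest at a predicted probability of 0.5 and grows as the prediction approaches either 0 or 1.

```python
import math

def bce(p, t, eps=1e-7):
    # single-class binary cross entropy for prediction p and target t
    return -(t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps))

# Per-class loss for a class whose normalized label is 0.5:
losses = {p: bce(p, 0.5) for p in (0.01, 0.1, 0.5, 0.9, 0.99)}
for p, l in losses.items():
    print(f"p={p}: loss={l:.4f}")
# Minimum at p = 0.5 (= ln 2); symmetric, increasing toward p -> 0 or 1.
```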

Francis777 avatar Jan 13 '22 12:01 Francis777

How do you visualize the generated labels?

Vishal1711 avatar May 05 '22 04:05 Vishal1711