
Testing results on SemanticKITTI

Open bobpop1 opened this issue 2 years ago • 36 comments

Thank you for your great work! We achieved the claimed results on the validation set. However, we only achieve 68.2 mIoU on the test set, which is much lower than the claimed 72.9.

bobpop1 avatar Oct 06 '22 05:10 bobpop1

Hi @bobpop1, could you share with me what scores you got on the validation set? Thanks a lot!

ldkong1205 avatar Oct 06 '22 07:10 ldkong1205

Hi @bobpop1, this checkpoint was obtained by training only on the training set; the results on the test set require training the model on both the training and validation sets.

yanx27 avatar Oct 06 '22 12:10 yanx27

Thanks for your reply. How do you choose the best checkpoint when training on both the training and validation sets?

bobpop1 avatar Oct 06 '22 13:10 bobpop1

Hi @bobpop1, could you share with me what scores you got on the validation set? Thanks a lot!

The best mIoU is 69.023.

bobpop1 avatar Oct 06 '22 13:10 bobpop1

Thank you for your information!

ldkong1205 avatar Oct 06 '22 13:10 ldkong1205

@bobpop1 Since we use a cosine learning rate scheduler, you can just choose the best or the final checkpoint from training. By the way, I noticed that you only reach 69.023 mIoU on the validation set, which is about 1% lower than ours. Did you correctly use TTA during inference?

yanx27 avatar Oct 07 '22 02:10 yanx27
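
A note for readers: with a cosine schedule whose period matches the full run, the learning rate decays to its minimum exactly at the last epoch, which is why the final checkpoint is a safe choice. A minimal sketch, assuming PyTorch's CosineAnnealingLR with a hypothetical model and optimizer (the actual 2DPASS training loop is built on pytorch-lightning and differs in detail):

import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

# Hypothetical stand-ins; the real 2DPASS model and optimizer settings differ.
model = torch.nn.Linear(16, 19)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
num_epochs = 64

# With T_max equal to the total number of epochs, the learning rate reaches
# its minimum exactly at the last epoch, so the final checkpoint is usually
# close to the best one.
scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    # ... one training epoch over the dataloader ...
    scheduler.step()
    torch.save(model.state_dict(), "last.ckpt")  # keep the latest checkpoint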

I get only 66.03 when testing with num_vote=1 and 68.88 with num_vote=12. I also evaluated the weights you released on Google Drive and got 68.5 with num_vote=1 and 70.05 with num_vote=12. Is the model sensitive to the version of spconv?

The environment is listed as follows:
python 3.9.13
pytorch 1.12.0 (py3.9_cuda11.3_cudnn8.3.2_0)
pytorch-lightning 1.3.8
torchmetrics 0.5
torch-scatter 2.0.9
spconv-cu114 2.2.3 (spconv-cu114 is much faster than spconv-cu111 during training)

The training runs on a single RTX A6000 with cuda=11.4. I follow the default settings but cannot achieve the same performance (68.5) as you. Do you have any advice? Thanks in advance!

isunLt avatar Oct 07 '22 06:10 isunLt

@isunLt Hi, I tested this code with pytorch 1.8 and spconv-cu111. It achieves 66.5 with num_vote=1 and 69.3 with num_vote=12, as reported in the paper. The result of 70.05 is obtained by fine-tuning the model for more epochs (see README). The difference (68.88 v.s. 63.9) may be caused by the running environment.

yanx27 avatar Oct 07 '22 06:10 yanx27

Thank you for your reply. I use the default TTA setting. What is the correct setting? Could you show me more details?

bobpop1 avatar Oct 07 '22 06:10 bobpop1

@bobpop1 Did you set `--num_vote=12` in the testing?

yanx27 avatar Oct 07 '22 06:10 yanx27
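
For context, --num_vote controls test-time augmentation (TTA): each scan is inferred several times under random augmentations and the per-point logits are averaged before taking the argmax. A minimal sketch of the voting idea with a toy augmentation; the actual 2DPASS transforms and model interface differ:

import math
import random
import torch

def random_augment(points: torch.Tensor) -> torch.Tensor:
    # Toy augmentation: random rotation around the z-axis plus random scaling.
    theta = random.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    rot = points.new_tensor([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (points @ rot.T) * random.uniform(0.95, 1.05)

@torch.no_grad()
def tta_predict(model, points: torch.Tensor, num_vote: int = 12) -> torch.Tensor:
    # Sum per-point logits over num_vote augmented copies, then take argmax.
    logits = sum(model(random_augment(points)) for _ in range(num_vote))
    return logits.argmax(dim=1)  # final per-point class labels

# Usage with a dummy per-point classifier over xyz coordinates:
model = torch.nn.Linear(3, 19).eval()
labels = tta_predict(model, torch.randn(1000, 3), num_vote=12)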

I see! Thanks for your quick response!

isunLt avatar Oct 07 '22 06:10 isunLt

Yes, we have set `--num_vote=12`. By the way, there is another question: I found that the code runs slower than the claimed results. What running environment do you use to achieve the faster speed?

bobpop1 avatar Oct 07 '22 06:10 bobpop1

@bobpop1 I see. If 69.023 was obtained by training for 64 epochs, it is very close to ours. As for the network speed, the default setting includes the data-loading time, which is very time-consuming. The inference time in our paper is the pure network inference time. You can follow the setting below to test the speed.

import time
import torch

torch.cuda.synchronize()  # finish any pending GPU work before timing
start = time.time()
model(data_dict)          # forward pass only
torch.cuda.synchronize()  # wait for the forward pass to complete
end = time.time()
print("inference time:", end - start)

yanx27 avatar Oct 07 '22 07:10 yanx27
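
A usage note on the snippet above: GPU timings are usually more stable after a few warm-up iterations and when averaged over repeated runs. A hedged variant, reusing the model and data_dict names from the snippet:

import time
import torch

for _ in range(10):  # warm-up: exclude CUDA init and autotuning from the timing
    model(data_dict)

torch.cuda.synchronize()
start = time.time()
runs = 100
for _ in range(runs):
    model(data_dict)
torch.cuda.synchronize()
print("avg inference time:", (time.time() - start) / runs)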

Have you tried training SemanticKITTI with multiple GPUs? Is there any difference in the config compared with training on a single GPU?

isunLt avatar Oct 07 '22 07:10 isunLt

@isunLt Hi, we haven't tried multiple GPUs on SemanticKITTI.

yanx27 avatar Oct 07 '22 07:10 yanx27

I see.

isunLt avatar Oct 07 '22 07:10 isunLt

Hi @yanx27, I have a similar question to @bobpop1's. Training on both the val and train sets while validating on the val set will make the model fit the val set, so the validation results should become much higher. What scores did you obtain under this setting? Did you train with the val set from the beginning, or did you fine-tune the model on val after training on train? Thanks

KIM-5-WEE-8 avatar Oct 07 '22 08:10 KIM-5-WEE-8

@KIM-5-WEE-8 We train directly from scratch on both the training and validation sets. As you mentioned, the results on the validation set are then meaningless, since the model has already fitted it. However, since we use a cosine learning rate scheduler, the best model is easily obtained from the last checkpoints (see the mIoU curve on the validation set in https://github.com/yanx27/2DPASS/issues/11). You can also gain a higher mIoU on the test set if you further fine-tune the model.

yanx27 avatar Oct 09 '22 04:10 yanx27

Thanks for your reply. I also found that there is an `instance_label` in the dataloader. How do you obtain the `instance_label`?

bobpop1 avatar Oct 09 '22 05:10 bobpop1

The instance label is generated here. For the instance-level augmentation, you can refer to RPVNet.

yanx27 avatar Oct 11 '22 00:10 yanx27
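
For context, SemanticKITTI .label files pack two fields into each 32-bit value: the lower 16 bits hold the semantic class and the upper 16 bits the instance ID. A minimal sketch of reading both fields (the file path is illustrative, and the dataloader may apply further class remapping):

import numpy as np

# Each per-point label is a uint32 in the SemanticKITTI format:
# lower 16 bits = semantic class, upper 16 bits = instance ID.
labels = np.fromfile("sequences/00/labels/000000.label", dtype=np.uint32)
semantic_label = labels & 0xFFFF  # per-point semantic class
instance_label = labels >> 16     # per-point instance ID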

Thank you very much. We trained the model on the training and validation sets; the best val mIoU is 81.043. We uploaded the last checkpoint and the best checkpoint to CodaLab, but the test mIoU is 65 and 69, respectively.

bobpop1 avatar Oct 11 '22 13:10 bobpop1

Hi, according to your results, you may have forgotten to modify the training size before training on the additional validation set. In practice, the last checkpoint will be very close to the best one. If you use the original training size, the model will reach the minimum learning rate in earlier epochs and cannot converge in the last one. You can also set lr_scheduler to CosineAnnealingLR if you don't want to change the training size.

yanx27 avatar Oct 13 '22 01:10 yanx27
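
To make the training-size dependency concrete: if the scheduler's period is derived from the number of scans, keeping the train-only size while adding the validation scans makes the learning rate bottom out several epochs before training ends. A minimal sketch of the relationship, assuming an iteration-level cosine schedule; the actual 2DPASS config keys and values may differ:

import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

batch_size = 8                 # hypothetical
num_epochs = 64
training_size = 19132 + 4071   # train + val scans = 23203 when using both splits

# If training_size still counted only the train split, T_max would be too
# small and the learning rate would hit its minimum before the final epoch.
steps_per_epoch = training_size // batch_size
model = torch.nn.Linear(3, 19)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs * steps_per_epoch)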

Thank you very much! As far as I know, there are 19130 scans in the training set. Why do you set training_size to 19132?

bobpop1 avatar Oct 13 '22 02:10 bobpop1

We use the same config as SPVNAS.

yanx27 avatar Oct 13 '22 02:10 yanx27

From Fig. 6 in your paper, I observe that AF2S3Net achieves 84.4 mIoU in the 0-10m range. To our knowledge, the source code of AF2S3Net is not released; how did you obtain these results?

bobpop1 avatar Oct 15 '22 05:10 bobpop1

We followed the above settings and changed the training size to 23203, so the last checkpoint is close to the best one. However, the best val mIoU is still about 81.0, and the best test mIoU is 68.

bobpop1 avatar Oct 16 '22 06:10 bobpop1

According to your results, using the additional validation set seems to even reduce the performance (68.2 v.s. 68), which differs from our observation. Currently, we are very busy with the upcoming deadline; the code and pre-trained model for the benchmark will be released once everything is well prepared.

yanx27 avatar Oct 17 '22 00:10 yanx27

Could you share with me the mIoU score of a model trained on both the training and validation sets and evaluated on the test set?

bobpop1 avatar Oct 17 '22 12:10 bobpop1

@bobpop1 Hi, I want to ask whether you trained for 64 epochs. My model uses the default 64 epochs but only reaches about 65.4 mIoU on the val set.

ZHUANGMINGXI avatar Oct 18 '22 13:10 ZHUANGMINGXI

Thanks for your great work! As you mentioned in the README, you trained the SemanticKITTI model for more epochs and thus gained a higher mIoU. Could you please tell me how many epochs you used?

chenst27 avatar Oct 28 '22 14:10 chenst27