
Training and inference details

Open DarrenQu opened this issue 1 year ago • 8 comments

Thank you for your outstanding work. Regarding the data presented in Table 2 of your paper, could you please provide more training and inference details? For example, how did you train the models to achieve the results shown in the table? Are there any pre-trained models available that I can use?

DarrenQu avatar Jul 11 '24 21:07 DarrenQu


I trained following the README provided by the author. Since I used spconv 2.x, I did not use the checkpoints provided by the author.

The training commands and test results I used are shown below. I hope you can also share your training commands and test results to facilitate the discussion.

Results

m1 based

stage 1

train

CUDA_VISIBLE_DEVICES=1 python opencood/tools/train.py -y None --model_dir opencood/logs/origin_heal/m1_based/stage1

test and results

  • early
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage1 --fusion_method early
The Average Precision at IOU 0.3 is 0.96, The Average Precision at IOU 0.5 is 0.96, The Average Precision at IOU 0.7 is 0.93
  • intermediate
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage1 --fusion_method intermediate
The Average Precision at IOU 0.3 is 0.96, The Average Precision at IOU 0.5 is 0.96, The Average Precision at IOU 0.7 is 0.93
  • late
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage1 --fusion_method late
The Average Precision at IOU 0.3 is 0.96, The Average Precision at IOU 0.5 is 0.96, The Average Precision at IOU 0.7 is 0.92
  • no
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage1 --fusion_method no
AttributeError: 'IntermediateheterFusionDataset' object has no attribute 'post_process_no_fusion_uncertainty'
  • no_w_uncertainty
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage1 --fusion_method no_w_uncertainty
AttributeError: 'IntermediateheterFusionDataset' object has no attribute 'post_process_no_fusion_uncertainty'

stage 2

train

# m2
CUDA_VISIBLE_DEVICES=1 python opencood/tools/train.py -y None --model_dir opencood/logs/origin_heal/m1_based/stage2/m2

# m3
CUDA_VISIBLE_DEVICES=1 python opencood/tools/train.py -y None --model_dir opencood/logs/origin_heal/m1_based/stage2/m3

# m4
CUDA_VISIBLE_DEVICES=1 python opencood/tools/train.py -y None --model_dir opencood/logs/origin_heal/m1_based/stage2/m4

test and results

  • early
# m2
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m2 --fusion_method early
The Average Precision at IOU 0.3 is 0.45, The Average Precision at IOU 0.5 is 0.34, The Average Precision at IOU 0.7 is 0.19

# m3
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m3 --fusion_method early
The Average Precision at IOU 0.3 is 0.78, The Average Precision at IOU 0.5 is 0.77, The Average Precision at IOU 0.7 is 0.68

# m4
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m4 --fusion_method early
The Average Precision at IOU 0.3 is 0.41, The Average Precision at IOU 0.5 is 0.30, The Average Precision at IOU 0.7 is 0.14
  • intermediate
# m2
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m2 --fusion_method intermediate
The Average Precision at IOU 0.3 is 0.45, The Average Precision at IOU 0.5 is 0.34, The Average Precision at IOU 0.7 is 0.19

# m3
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m3 --fusion_method intermediate
The Average Precision at IOU 0.3 is 0.78, The Average Precision at IOU 0.5 is 0.77, The Average Precision at IOU 0.7 is 0.68

# m4
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m4 --fusion_method intermediate
The Average Precision at IOU 0.3 is 0.41, The Average Precision at IOU 0.5 is 0.30, The Average Precision at IOU 0.7 is 0.14
  • late
# m2
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m2 --fusion_method late
The Average Precision at IOU 0.3 is 0.82, The Average Precision at IOU 0.5 is 0.70, The Average Precision at IOU 0.7 is 0.47

# m3
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m3 --fusion_method late
The Average Precision at IOU 0.3 is 0.94, The Average Precision at IOU 0.5 is 0.93, The Average Precision at IOU 0.7 is 0.89

# m4
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m4 --fusion_method late
The Average Precision at IOU 0.3 is 0.78, The Average Precision at IOU 0.5 is 0.65, The Average Precision at IOU 0.7 is 0.39
  • no
# m2
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m2 --fusion_method no
The Average Precision at IOU 0.3 is 0.45, The Average Precision at IOU 0.5 is 0.34, The Average Precision at IOU 0.7 is 0.19

# m3
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m3 --fusion_method no
The Average Precision at IOU 0.3 is 0.45, The Average Precision at IOU 0.5 is 0.34, The Average Precision at IOU 0.7 is 0.19

# m4
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m4 --fusion_method no
The Average Precision at IOU 0.3 is 0.41, The Average Precision at IOU 0.5 is 0.30, The Average Precision at IOU 0.7 is 0.14
  • no_w_uncertainty
# m2
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m2 --fusion_method no_w_uncertainty
TypeError: VoxelPostprocessor.post_process() got an unexpected keyword argument 'return_uncertainty'

# m3
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m3 --fusion_method no_w_uncertainty
TypeError: VoxelPostprocessor.post_process() got an unexpected keyword argument 'return_uncertainty'

# m4
CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference.py --model_dir opencood/logs/origin_heal/m1_based/stage2/m4 --fusion_method no_w_uncertainty
TypeError: VoxelPostprocessor.post_process() got an unexpected keyword argument 'return_uncertainty'

final infer

python opencood/tools/heal_tools.py merge_final \
  opencood/logs/origin_heal/m1_based/stage2/m2 \
  opencood/logs/origin_heal/m1_based/stage2/m3 \
  opencood/logs/origin_heal/m1_based/stage2/m4 \
  opencood/logs/origin_heal/m1_based/stage1 \
  opencood/logs/origin_heal/m1_based/final_infer/m1m2m3m4

CUDA_VISIBLE_DEVICES=1 python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/origin_heal/m1_based/final_infer/m1m2m3m4

Chinese-Coding avatar Jul 13 '24 08:07 Chinese-Coding


Checkpoint

Hi! First, you can find our checkpoints on the Hugging Face Hub (https://huggingface.co/yifanlu/HEAL/tree/main). The HEAL models for OPV2V-H and DAIR-V2X are stored as HEAL_OPV2V.zip and HEAL_DAIR.zip, which you can use for inference directly. Just note that they were trained with spconv 1.2.1 and are not compatible with spconv 2.x.
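A minimal sketch of fetching and unpacking the OPV2V checkpoint (the target directory is an assumption based on the inference command below; adjust it to your own layout):

# download HEAL_OPV2V.zip from the Hugging Face page above, then unpack it
# so the released model directories end up under opencood/logs/ (assumed layout)
mkdir -p opencood/logs
unzip HEAL_OPV2V.zip -d opencood/logs/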

Inference

To reproduce the results in the table, you should use opencood/tools/inference_heter_in_order.py. It adopts a sequential-assignment JSON file (e.g., modality_assign/opv2v_4modality_in_order.json) to make sure agent types are integrated in order, and it controls the agent types (as well as the number of agents) in the scene via --use_cav.

--use_cav accepts a list. --use_cav [2,3,4] means the number of agent types in the scene is set to 2, 3, and 4 in turn for inference, which corresponds to the experiment table (based on $L_P^{(64)}$, gradually adding $C_E^{(384)}$, $L_S^{(32)}$, $C_R^{(336)}$).

python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/HEAL_OPV2V/final_infer --use_cav [2,3,4]
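If you prefer one run per setting, a single-element list works as well (the Table 3 command further down uses --use_cav [4]); a sweep over the three settings would then look like this, reusing the model directory above:

python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/HEAL_OPV2V/final_infer --use_cav [2]
python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/HEAL_OPV2V/final_infer --use_cav [3]
python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/HEAL_OPV2V/final_infer --use_cav [4]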

Training

Follow the instructions in the README.

yifanlu0227 avatar Jul 16 '24 02:07 yifanlu0227

Since @Chinese-Coding found some bugs in the model configuration for SECOND (#20), it is not recommended to develop new models based on the existing checkpoints (spconv 1.2.1). Just take them as a baseline for comparison.

You will enjoy a much more convenient installation with spconv 2.x, and training from scratch is not that slow.

yifanlu0227 avatar Jul 16 '24 02:07 yifanlu0227


Hello, I have another question: is Table 2 obtained using opv2v_4modality_in_order.json? And is Table 3 configured using opv2v_4modality.json, so that the ego can be any modality (unlike Table 2)?

scz023 avatar Aug 02 '24 10:08 scz023

Hi @scz023, Table 3 also uses opv2v_4modality_in_order.json, as stated in the paper:

Table 3: Heterogeneous type agents are added in the order presented from left to right in each type combination.

You can get the metrics by

python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/HEAL_m1_based/final_infer --use_cav [4] --lidar_degrade

yifanlu0227 avatar Aug 05 '24 09:08 yifanlu0227


Thank you! Besides, how are the baseline results obtained? Is opencood/hypes_yaml/opv2v/MoreModality/x_modality_end2end_training/xxx.yaml used directly for multi-modal end-to-end training? And if so, how are the baseline results in Table 1 obtained?

scz023 avatar Aug 06 '24 21:08 scz023

Is opencood/hypes_yaml/opv2v/MoreModality/x_modality_end2end_training/xxx.yaml used directly for multi-modal end-to-end training?

Yes! You then run inference with python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/<your baseline experiment log>.

Note that the --use_cav argument should align with the modality & model count in the input YAML file. Baselines need to retrain the model from scratch whenever a new modality & model is added; a rough train-and-evaluate sketch follows the examples below.

For example,

  1. You train opencood/hypes_yaml/opv2v/MoreModality/3_modality_end2end_training/m1m2m3_attfuse.yaml, you are supposed to use --use_cav [3].
  2. You train opencood/hypes_yaml/opv2v/MoreModality/4_modality_end2end_training/m1m2m3m4_attfuse.yaml, you are supposed to use --use_cav [4].
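As a rough sketch of the full baseline workflow for the 4-modality case (assuming train.py accepts a config path via -y, as in the training commands earlier in this thread; the GPU index and log path are illustrative):

# train the end-to-end baseline from scratch with the 4-modality YAML
CUDA_VISIBLE_DEVICES=0 python opencood/tools/train.py -y opencood/hypes_yaml/opv2v/MoreModality/4_modality_end2end_training/m1m2m3m4_attfuse.yaml

# evaluate with an agent-type count that matches the YAML
CUDA_VISIBLE_DEVICES=0 python opencood/tools/inference_heter_in_order.py --model_dir opencood/logs/<your baseline experiment log> --use_cav [4]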

yifanlu0227 avatar Aug 07 '24 05:08 yifanlu0227

Note that the --use_cav argument should align with the modality & model count in the input YAML file. Baselines need to retrain the model from scratch whenever a new modality & model is added.

Got it, thank you!

scz023 avatar Aug 07 '24 06:08 scz023