Where is the temporal fusion code in PETRv2?

Open lianshushu opened this issue 1 year ago • 6 comments

Where is the temporal fusion code in PETRv2? I only see the FPE difference between v1 and v2.

lianshushu avatar Apr 13 '23 08:04 lianshushu

Hello, the temporal fusion in PETRv2 is mainly processed in an offline manner: https://github.com/megvii-research/PETR/blob/e48faec8aa24bdd14f95692428ddc4982f2f71cb/tools/generate_sweep_pkl.py#L53

Recently, we have another version of temporal fusion for PETR, named StreamPETR, which performs better than PETRv2. It processes the temporal fusion totally online: https://github.com/exiawsh/StreamPETR

exiawsh avatar Apr 14 '23 06:04 exiawsh

@exiawsh Hi, could you explain in more detail? In PETRv2 the loaded data has 12 camera images, so after the backbone the result is equivalent to concatenating the 2D feature maps of frames t-1 and t. But I don't understand how the 3D coordinates of the previous frame are transformed into the coordinates of the current frame t through the pose transformation.

pianogGG avatar May 16 '23 06:05 pianogGG

Hi, there are two designs for temporal modeling in PETRv2:

  1. Please refer to the data preparation for temporal alignment in https://github.com/megvii-research/PETR/blob/e48faec8aa24bdd14f95692428ddc4982f2f71cb/tools/generate_sweep_pkl.py#L53 (align the coordinates of previous frames)
  2. The multi-view embedding: https://github.com/megvii-research/PETR/blob/e48faec8aa24bdd14f95692428ddc4982f2f71cb/projects/mmdet3d_plugin/models/dense_heads/petrv2_head.py#LL472C26-L472C35
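
To illustrate the first point, here is a minimal sketch of how a 3D point expressed in the previous frame's ego coordinates can be moved into the current frame's ego coordinates by chaining pose transforms (previous ego to global, then global to current ego). It is not taken from the repository: the helper names (`pose_matrix`, `invert_rigid`, `prev_to_cur`) are hypothetical, the poses are simplified to a planar yaw plus translation, and the real script also handles the camera and lidar calibration, which is omitted here.

```python
import math


def pose_matrix(yaw, tx, ty, tz=0.0):
    """Hypothetical helper: 4x4 ego-to-global transform for a planar pose (yaw in radians)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t):
    """Invert a rigid transform: [R | p]^-1 = [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def prev_to_cur(prev_ego2global, cur_ego2global):
    """Transform taking previous-ego coordinates to current-ego coordinates."""
    return matmul(invert_rigid(cur_ego2global), prev_ego2global)


def apply(t, point):
    """Apply a 4x4 transform to a 3D point."""
    v = [point[0], point[1], point[2], 1.0]
    return tuple(sum(t[i][k] * v[k] for k in range(4)) for i in range(3))


# Assumed example poses: the ego vehicle drove 2 m forward between t-1 and t.
prev_pose = pose_matrix(0.0, 0.0, 0.0)
cur_pose = pose_matrix(0.0, 2.0, 0.0)
aligned = apply(prev_to_cur(prev_pose, cur_pose), (5.0, 0.0, 0.0))
```

In this toy case, a point 5 m ahead of the vehicle at t-1 ends up 3 m ahead of it at t, because the vehicle itself moved 2 m forward. Doing this alignment once during data preparation is what allows the previous-frame coordinates to share the current frame's coordinate space.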

exiawsh avatar May 16 '23 08:05 exiawsh

@exiawsh When I run test.py, choices=14 at https://github.com/megvii-research/PETR/blob/e48faec8aa24bdd14f95692428ddc4982f2f71cb/projects/mmdet3d_plugin/datasets/pipelines/loading.py#L114, and this is not a key frame. So I am wondering whether it could directly reuse the backbone results of the previous key frame, saving half of the backbone computation. Is this feasible? Will it affect mAP?

pianogGG avatar May 22 '23 03:05 pianogGG

@exiawsh Thanks a lot!

pianogGG avatar May 22 '23 03:05 pianogGG

It will affect the mAP (the time interval between the two frames is important for PETRv2). This is why PETRv2 consumes more training and testing time. Consider using StreamPETR instead.

exiawsh avatar May 22 '23 03:05 exiawsh