
Why are VADPerceptionTransformer.cams_embeds and level_embeds tensors with random values?

Open hollyaoyaozi opened this issue 1 year ago • 0 comments

Hi, dear authors. When I evaluate VAD following the official steps and run forward inference, the output of BEVFormerEncoder is sometimes a tensor containing NaN, and this happens randomly. I found that the feature output by the backbone + FPN is added to self.cams_embeds and self.level_embeds, which are tensors with random values, and some of their elements are very large (on the order of e+31). As a result, features with very large values are fed into BEVFormerEncoder, which outputs +/-inf and then produces NaN after LayerNorm. So I wonder: why are self.cams_embeds and self.level_embeds (nn.Parameter objects) random tensors rather than tensors with fixed, trained values? screenshot-20231107-162720
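To narrow this down, one thing that could be checked (my own guess, not something the authors have confirmed) is whether the two parameters are actually present in the loaded checkpoint, and whether the values the model holds after loading look sane. Below is a minimal stdlib-only sketch of that check. The function `diagnose_params`, the short key names, the threshold `max_abs`, and the sample values are all hypothetical; in a real run the names and values would come from `model.state_dict()` and from the checkpoint file, and the full key names would carry whatever module prefix the model uses.

```python
import math

def diagnose_params(model_params, ckpt_params, names, max_abs=1e3):
    """For each name, report whether the checkpoint provides it and how many
    of the values currently held by the model look suspicious.

    model_params / ckpt_params: dict mapping parameter name -> flat list of floats
    max_abs: hypothetical threshold above which a value is treated as suspicious
    """
    report = {}
    for name in names:
        values = model_params.get(name, [])
        # A value is suspicious if it is NaN/inf or implausibly large for an embedding.
        bad = [v for v in values if not math.isfinite(v) or abs(v) > max_abs]
        report[name] = {
            "in_checkpoint": name in ckpt_params,
            "suspicious_values": len(bad),
        }
    return report

# Hypothetical data standing in for flattened parameter values.
model_params = {
    "cams_embeds": [0.01, -0.02, 3.2e31],   # one element of order e+31, as in the report
    "level_embeds": [0.03, 0.00, -0.01],
}
ckpt_params = {
    "level_embeds": [0.03, 0.00, -0.01],    # "cams_embeds" deliberately missing here
}

report = diagnose_params(model_params, ckpt_params, ["cams_embeds", "level_embeds"])
for name, info in report.items():
    print(name, info)
```

If a parameter turned out to be missing from the checkpoint and also held huge values, that would suggest it was never overwritten by trained weights and never explicitly initialized, but this is only a diagnostic sketch under the assumptions above.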

hollyaoyaozi, Nov 07 '23 08:11