A-ViT A question about the halting score distribution code

Open DYZhang09 opened this issue 3 years ago • 1 comment

In the paper, the halting score distribution is defined as below:

However, the corresponding code seems wrong. https://github.com/NVlabs/A-ViT/blob/120c9cb90acf86828f1c61dd42c08722aa7173c7/timm/models/act_vision_transformer.py#L464-L465

The shape of `h_lst[1]` is `[B, N]`, so the code seems to average over the whole batch and to ignore the first sample of each batch. I think the correct code is: `self.halting_score_layer.append(torch.mean(h_lst[1][:, 1:], dim=-1))`
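A minimal sketch of the difference, assuming the linked lines slice along the first axis (something like `h[1:]`). The linked code is not reproduced here, so that form is an assumption, and the tensor `h` with its values is hypothetical:

```python
import torch

# Hypothetical halting scores of shape [B, N] = [2, 3];
# column 0 stands in for the class token of each sample.
h = torch.tensor([[0.9, 0.1, 0.3],
                  [0.8, 0.5, 0.7]])

# Assumed form of the existing code: slicing the first axis drops
# sample 0, then the mean reduces everything left to a single scalar.
batch_sliced = torch.mean(h[1:])

# Proposed form: slicing the second axis drops token 0 of every
# sample, then the mean over the last axis gives one value per sample.
per_sample = torch.mean(h[:, 1:], dim=-1)

print(batch_sliced.shape)  # scalar: no batch dimension left
print(per_sample.shape)    # one entry per sample in the batch
```

The first form mixes tokens from every remaining sample into one number, while the second keeps a separate average for each sample.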

Can you tell me which one is correct? Thanks!

Jul 28 '22 03:07 DYZhang09

I have the same question

Nov 24 '22 13:11 Ther-nullptr
