Number of tokens differs from the paper

Open daniel-code opened this issue 2 years ago • 2 comments

The paper (Section 4.1) reports 559 tokens for the tube configuration below, but my implementation produces 539.

  • 8 x 8 x 8 with a stride of (16, 32, 32)
  • 16 x 4 x 4 with a stride of (6, 32, 32) and an offset of (4, 8, 8)
  • 4 x 12 x 12 with a stride of (16, 32, 32) and an offset of (0, 16, 16)
  • 1 x 16 x 16 with a stride of (32, 16, 16)

The paper states that, for an input of 32 x 224 x 224, this results in only 559 tokens.

The number of tokens per tube in my implementation:

  • 8 x 8 x 8 with a stride of (16, 32, 32) -> 98
  • 16 x 4 x 4 with a stride of (6, 32, 32) and an offset of (4, 8, 8) -> 147
  • 4 x 12 x 12 with a stride of (16, 32, 32) and an offset of (0, 16, 16) -> 98
  • 1 x 16 x 16 with a stride of (32, 16, 16) -> 196

The total number of tokens is 98 + 147 + 98 + 196 = 539, which is 20 fewer than the 559 reported in the paper.
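
For reference, the per-tube counts can be reproduced with a short sketch. It assumes each tube is applied like an unpadded strided 3D convolution and that the offset simply shifts the position where the kernel starts; both are assumptions about the implementation, not something confirmed by the paper. `tube_token_count` is a hypothetical helper name used only for illustration.

```python
from math import prod


def tube_token_count(input_shape, kernel, stride, offset=(0, 0, 0)):
    """Count tokens for one tube shape (hypothetical helper).

    Assumes no padding: along each axis the kernel starts at `offset`
    and slides by `stride` while it still fits inside the input.
    """
    counts = []
    for size, k, s, o in zip(input_shape, kernel, stride, offset):
        span = size - o - k  # room left after placing the first kernel
        counts.append(span // s + 1 if span >= 0 else 0)
    return prod(counts)


input_shape = (32, 224, 224)  # (T, H, W)

# (kernel, stride, offset) for the four tubes listed in the issue
tubes = [
    ((8, 8, 8), (16, 32, 32), (0, 0, 0)),
    ((16, 4, 4), (6, 32, 32), (4, 8, 8)),
    ((4, 12, 12), (16, 32, 32), (0, 16, 16)),
    ((1, 16, 16), (32, 16, 16), (0, 0, 0)),
]

per_tube = [tube_token_count(input_shape, k, s, o) for k, s, o in tubes]
total = sum(per_tube)
print(per_tube, total)
```

Under these assumptions the sketch gives the same per-tube counts as listed above, so the 20-token gap would have to come from a different padding, offset, or rounding convention in the paper.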

daniel-code, Feb 26 '23 07:02