
Duplicate dividing in relative positional encoding

Open · songweige opened this issue 2 years ago • 1 comment

Hey @lucidrains, thanks for keeping these models implemented. In line 88 of https://github.com/lucidrains/video-diffusion-pytorch/blob/f55f1b0824b1be7d2bb555ed7a5d612eff8ad5d0/video_diffusion_pytorch/video_diffusion_pytorch.py#L84-L88 you set max_exact to half of num_buckets, but num_buckets was already halved in line 84.

I think the second halving is a duplicate and line 88 should be changed to a plain assignment:

 max_exact = num_buckets

songweige · Jun 15 '22 23:06

I suggest reading the paper "On Scalar Embedding of Relative Positions in Attention Models", which explains the bucketing function implemented here.
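For reference, here is a minimal pure-Python sketch of this kind of bucketing for a single integer offset. It assumes the linked lines follow the usual T5-style scheme (halve num_buckets once per direction, then halve again to split exact from log-spaced buckets); the repository version operates on tensors. The function name and the `halve_max_exact` flag are hypothetical and exist only to compare the current behavior with the proposed change.

```python
import math

def relative_position_bucket(relative_position, num_buckets=32, max_distance=128,
                             halve_max_exact=True):
    """Map one integer relative position to a bucket index (illustrative sketch)."""
    n = -relative_position

    # First halving: one half of the buckets per direction (sign of the offset).
    num_buckets //= 2
    ret = num_buckets if n < 0 else 0
    n = abs(n)

    # Second halving: within one direction, half of the buckets hold exact
    # distances and the other half are log-spaced up to max_distance.
    # halve_max_exact=False reproduces the change proposed in this issue.
    max_exact = num_buckets // 2 if halve_max_exact else num_buckets

    if n < max_exact:
        return ret + n

    val_if_large = max_exact + int(
        math.log(n / max_exact) / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    )
    # Clamp so distances beyond max_distance share the last bucket.
    return ret + min(val_if_large, num_buckets - 1)


if __name__ == "__main__":
    for distance in (3, 10, 12, 100, 1000):
        print(
            distance,
            relative_position_bucket(-distance),
            relative_position_bucket(-distance, halve_max_exact=False),
        )
```

Under these assumptions and the default arguments, the first halving leaves 16 buckets per direction and the second makes 8 of them exact and 8 log-spaced. With `halve_max_exact=False`, every distance of 15 or more lands in the last bucket of its direction, so the log-spaced range disappears.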

oxjohanndiep · Jul 02 '22 06:07