video-diffusion-pytorch
Duplicate dividing in relative positional encoding
Hey @lucidrains, thanks for keeping these models implemented. In line 88 https://github.com/lucidrains/video-diffusion-pytorch/blob/f55f1b0824b1be7d2bb555ed7a5d612eff8ad5d0/video_diffusion_pytorch/video_diffusion_pytorch.py#L84-L88 you set max_exact to half of num_buckets, whose value was already halved in line 84.
I think the second halving is a duplicate, and the assignment should be changed to the identity:
max_exact = num_buckets
I suggest you read the paper "On Scalar Embedding of Relative Positions in Attention Models". In that paper, they explain the implemented bucketing function.
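To make the two variants easier to compare, here is a minimal scalar sketch of the bucketing. It is not copied from the repository: it assumes the linked lines follow the usual T5-style scheme (one halving of num_buckets to split by sign, then a split into exact and logarithmic buckets), and the function name, the default values, and the `halve_max_exact` flag are hypothetical, added only to switch between the current line 88 and the change proposed above.

```python
import math


def relative_position_bucket(relative_position, num_buckets=32,
                             max_distance=128, halve_max_exact=True):
    """Scalar sketch of T5-style relative position bucketing (assumed scheme)."""
    n = -relative_position

    # First halving: one half of the buckets per sign of the offset.
    num_buckets //= 2
    ret = num_buckets if n < 0 else 0
    n = abs(n)

    # Hypothetical switch: True mimics the current line 88,
    # False mimics the proposed `max_exact = num_buckets`.
    max_exact = num_buckets // 2 if halve_max_exact else num_buckets

    # Small distances each get their own bucket.
    if n < max_exact:
        return ret + n

    # Larger distances share logarithmically spaced buckets up to max_distance.
    val_if_large = max_exact + int(
        math.log(n / max_exact)
        / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    )
    return ret + min(val_if_large, num_buckets - 1)


if __name__ == "__main__":
    # Print the bucket for a few distances under both variants.
    for distance in (0, 3, 8, 15, 16, 64, 1000):
        current = relative_position_bucket(-distance, halve_max_exact=True)
        proposed = relative_position_bucket(-distance, halve_max_exact=False)
        print(distance, current, proposed)
```

Running both variants over a range of offsets shows how each one divides the buckets between exact and logarithmic positions.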