Long-Range-Grouping-Transformer
LGA grouping strategy
Thank you for your excellent work. While studying the code, I had trouble following the implementation of the LGA grouping. Could you please explain what the dimension changes in the following `rearrange` call represent, and how they reflect the grouping of different blocks?

```
q_cls, k_cls, v_cls = map(
    lambda t: rearrange(
        t,
        '(b v) h (t1 s1 t2 s2) d -> b (s1 s2) h (t1 t2 v) d',
        b=b_s, v=view_num,
        t1=token_num, s1=self.group_num,
        t2=token_num, s2=self.group_num),
    (q, k, v))
```

Looking forward to your reply!
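To make the question concrete, here is one possible reading of the `rearrange` pattern quoted above, written as a minimal pure-Python sketch. It is not taken from the repository. It assumes the token axis `(t1 s1 t2 s2)` is a row-major flattening of a square grid with side `token_num * group_num`, so that `t1 s1` indexes the row and `t2 s2` indexes the column. The helper names `lga_group_index` and `group_map` are hypothetical and exist only for illustration.

```python
def lga_group_index(n, view, token_num, group_num, view_num):
    """Map flat token index n of one view to (group, slot).

    Assumption: n is laid out as '(t1 s1 t2 s2)' in row-major order,
    i.e. t1 varies slowest and s2 varies fastest.
    """
    t, s = token_num, group_num
    # Peel the axes off from fastest to slowest.
    n, s2 = divmod(n, s)
    n, t2 = divmod(n, t)
    t1, s1 = divmod(n, s)
    group = s1 * s + s2                     # output axis '(s1 s2)'
    slot = (t1 * t + t2) * view_num + view  # output axis '(t1 t2 v)'
    return group, slot


def group_map(token_num, group_num):
    """Group id of every grid cell for a single view (hypothetical helper)."""
    side = token_num * group_num
    return [
        [lga_group_index(r * side + c, 0, token_num, group_num, 1)[0]
         for c in range(side)]
        for r in range(side)
    ]


if __name__ == "__main__":
    for row in group_map(token_num=2, group_num=2):
        print(row)
```

Under these assumptions, a token at grid position `(row, col)` lands in group `(row % group_num, col % group_num)`. Tokens in the same group are therefore spaced `group_num` cells apart in both directions, and each group collects `token_num * token_num` tokens from every view. Whether this matches the intended LGA behaviour is exactly what I would like to confirm.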