Roh Seung Chan

1 issue by Roh Seung Chan

When the queries, keys, and values are reshaped in the intersample() function, shouldn’t the target shape be (1, h, b, n*d)? The code has a structure in which 8 heads per batch...
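To make the proposed shape concrete, here is a minimal NumPy sketch of what a (1, h, b, n*d) layout would mean. It is not the repository's code. The axis meanings are assumptions, since the excerpt does not define them: b is the batch size, h the number of heads (8 in the question), n the number of features per sample, and d the per-head dimension. The starting layout (b, h, n, d), the helper names, and the scaling factor are also hypothetical.

```python
import numpy as np

def to_intersample_layout(t):
    """Rearrange (b, h, n, d) -> (1, h, b, n*d). Assumed layout, for illustration only."""
    b, h, n, d = t.shape
    # Put heads ahead of the batch axis, then flatten each sample's n*d values per head.
    return t.transpose(1, 0, 2, 3).reshape(1, h, b, n * d)

def intersample_weights(q, k):
    """Softmax attention weights across samples: one (b, b) matrix per head."""
    q2, k2 = to_intersample_layout(q), to_intersample_layout(k)
    scale = q2.shape[-1] ** -0.5  # assumed scaling by the flattened width n*d
    logits = np.einsum("ghid,ghjd->ghij", q2, k2) * scale
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)

b, h, n, d = 4, 8, 3, 2
rng = np.random.default_rng(0)
q = rng.standard_normal((b, h, n, d))
k = rng.standard_normal((b, h, n, d))

weights = intersample_weights(q, k)
print(weights.shape)  # (1, 8, 4, 4)
```

With this layout, each of the 8 heads attends over the 4 samples in the batch, so the sequence axis seen by the attention is b rather than n. Whether that matches the intended behaviour of intersample() depends on the parts of the code that are not shown in the excerpt.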