Jamie DeAntonis
I added an arg so that you can specify the number of segments the index vectors will be quantized to
whoops, sorry about that. just changed
We observed it in both. I read [here](https://github.com/lucidrains/performer-pytorch/issues/64#issuecomment-819568003) that caching may be the reason. Are you still planning to implement it?
Hi all, I'm happy to join this conversation if there's anything to be done. Is this too inefficient? ```python import numpy as np q0 = np.array([[1, 2, 3], [4, 5,...
> And I'm not sure the dimension of q0 (2, 3) mean (L, D)? A sequence with two tokens and each token has 3 dimensions? Yes, that's what I meant. > If...
Does this mean `CausalDotProduct` in `fast-attention` is what we want?
Isn't this only solvable by implementing the for-loop directly in a lower-level language? I imagine that's effectively what fast attention does.
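To make the for-loop concrete: here is a minimal numpy sketch of the prefix-sum form of causal linear attention that the loop would implement. The function name, shapes, and the small epsilon in the denominator are my own choices for illustration, not anything from fast-attention itself.

```python
import numpy as np

def causal_linear_attention(q, k, v):
    """Naive causal linear attention via an explicit loop over positions.

    q, k: (L, D) feature-mapped queries/keys (assumed non-negative);
    v: (L, M) values. Returns (L, M) outputs.
    """
    L, D = q.shape
    M = v.shape[1]
    out = np.zeros((L, M))
    s = np.zeros((D, M))   # running prefix sum of outer(k_i, v_i)
    z = np.zeros(D)        # running prefix sum of k_i (normalizer)
    for i in range(L):
        s += np.outer(k[i], v[i])
        z += k[i]
        # epsilon is an arbitrary choice here to avoid division by zero
        out[i] = (q[i] @ s) / (q[i] @ z + 1e-6)
    return out
```

The point of the loop is that each position only ever sees the prefix sums up to itself, which is exactly the causal mask; the cost is that a Python-level loop over L is slow, which is why a C++/CUDA kernel (as in fast-attention's `CausalDotProduct`) is the practical route.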
Can we talk over a call? I just emailed you
I was just reading the fast attention code, and I think it does exactly what we want. Typing is really the only reason the c++ code is torch-specific. Otherwise, all...
I don't think I'm the guy to do this (I don't use c++ or tensorflow), but I think this is a pretty easy problem for someone who at least knows...
@ice-americano (who I work with) ran it and it seemed to work to some degree. Compared to regular attention, he was getting significant improvements in memory usage, but a noticeable...