memorizing-transformers-pytorch

Implementation of Memorizing Transformers (ICLR 2022), an attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in PyTorch

Results 10 memorizing-transformers-pytorch issues

I have two questions about the key and value calculation in Attention (and similarly for KNNAttention). The relevant line is: https://github.com/lucidrains/memorizing-transformers-pytorch/blob/83fa1479d6f7881dd977fbff55681e709e3b250e/memorizing_transformers_pytorch/memorizing_transformers_pytorch.py#L135 1. Why is there only one Linear layer `to_kv`,...
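The excerpt above is cut off, so the full question is not visible. As background only, here is a minimal sketch of what a single fused key/value projection does, assuming `to_kv` maps the input to twice the head dimension and the result is split in half. The sizes and the separate `to_k` / `to_v` layers are hypothetical and chosen for illustration; they are not taken from the repository.

```python
import torch
from torch import nn

torch.manual_seed(0)

dim, dim_head = 16, 8  # hypothetical sizes, for illustration only

# One fused projection that produces keys and values in a single matmul.
to_kv = nn.Linear(dim, dim_head * 2, bias=False)

# Two separate projections that reuse the fused layer's weights
# (hypothetical layers, built only to show the equivalence).
to_k = nn.Linear(dim, dim_head, bias=False)
to_v = nn.Linear(dim, dim_head, bias=False)
with torch.no_grad():
    # nn.Linear stores weights as (out_features, in_features),
    # so the key and value halves are split along dim 0.
    w_k, w_v = to_kv.weight.chunk(2, dim=0)
    to_k.weight.copy_(w_k)
    to_v.weight.copy_(w_v)

x = torch.randn(2, 5, dim)  # (batch, sequence, dim)

# Fused path: project once, then split the last dimension in half.
k_fused, v_fused = to_kv(x).chunk(2, dim=-1)

# Separate path: two independent projections.
k_sep, v_sep = to_k(x), to_v(x)

print(torch.allclose(k_fused, k_sep, atol=1e-5))
print(torch.allclose(v_fused, v_sep, atol=1e-5))
```

Under these assumptions the fused layer is mathematically the same as two separate layers with independent weights; the fusion only saves a matrix multiplication call.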

https://github.com/lucidrains/memorizing-transformers-pytorch/blob/83fa1479d6f7881dd977fbff55681e709e3b250e/memorizing_transformers_pytorch/memorizing_transformers_pytorch.py#L237 Shouldn't this be (1-scale)?

Hey! Cool repo. I like all the kNN+LM methods. Did you do some runs yet? Anything interesting to report?

Hello and thanks for this implementation! Do you know of any solutions to efficiently solve the "hard reset" problem in FAISS? I know that one could use IndexFlatL2 but that's...

When I run train.py, I get this error: "index out of range: Tried to access index 10218 out of table with 255 rows. at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418"
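An error of this shape usually means an embedding lookup received a token ID that is not smaller than the number of rows in the embedding table. The sketch below is one way to check a batch before it reaches the model. The helper `find_out_of_range_ids` and the sample batch are hypothetical, and whether this is the actual cause in the reporter's setup is an assumption.

```python
def find_out_of_range_ids(token_ids, num_tokens):
    """Return (position, token_id) pairs that a table with num_tokens rows cannot look up."""
    return [
        (pos, tok)
        for pos, tok in enumerate(token_ids)
        if not 0 <= tok < num_tokens
    ]


# Hypothetical batch: byte-level IDs plus one ID from a larger vocabulary.
batch = [72, 101, 108, 10218, 111]

# 255 is the table size reported in the error message above.
bad = find_out_of_range_ids(batch, num_tokens=255)
print(bad)  # [(3, 10218)]
```

If the check reports offending IDs, the usual options are to enlarge the embedding table to cover the tokenizer's vocabulary or to encode the data with a vocabulary that fits the configured size.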

Thank you so much for the great implementation. I would like to ask whether your implementation of Memorizing Transformers could support multi-card distributed training like the original paper. If you distribute...

Current environment:
- faiss 1.7.1
- faiss-cpu 1.7.4
- joblib 1.3.1
- numpy 1.25.1
- pip 23.1.2
- setuptools 67.8.0
- wheel 0.38.4

I haven't installed PyTorch yet, because not...

Curious/puzzled. Would Google really release their model in PyTorch? Is this the official implementation of Memorizing Transformers? (By the way, great work!)