Results: 5 issues by Gary Linscott

Example game: 1. e4 Nc6 2. d4 d5 3. e5 h5 4. f4 Bf5 5. c3 e6 6. Nf3 a5 7. Bb5 Nh6 8. h3 Be4 9. O-O Nf5 10....

Need to pass depth into QSearch, or do fixup when we return.

This is a prototype of computing perplexity over the prompt input. It feeds `n_ctx - 1` tokens to the model as input and computes the softmax...

enhancement
generation quality
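As a rough illustration of the perplexity computation described in the snippet above, here is a minimal sketch in plain Python. It is not the prototype's actual code: the function names `log_softmax` and `perplexity`, and the layout of the logits as one list per position, are assumptions made for this example. It takes the logits the model produced at each input position, applies a softmax, and averages the negative log-probability of each next token.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def perplexity(logits_per_position, tokens):
    """Perplexity of `tokens` given per-position logits (hypothetical layout).

    logits_per_position[i] is assumed to hold the logits produced after the
    model has seen tokens[0..i], so it scores tokens[i + 1]. With n tokens
    there are therefore n - 1 scored positions.
    """
    if len(tokens) < 2 or len(logits_per_position) != len(tokens) - 1:
        raise ValueError("expected one logit vector per token except the last")

    nll = 0.0
    for logits, target in zip(logits_per_position, tokens[1:]):
        # Negative log-probability assigned to the token that actually came next.
        nll -= log_softmax(logits)[target]

    # Perplexity is the exponential of the mean negative log-likelihood.
    return math.exp(nll / len(logits_per_position))
```

With uniform logits over a vocabulary of size V, the result is V, which is a convenient sanity check for any implementation.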

[Draft] I'm seeing a significant difference in output logits when running with batch_size != ctx_size. I've instrumented the code to dump the logits, so I can compare them across batch_size=8,...

enhancement
generation quality
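One way to compare dumped logits across runs, such as the batch-size comparison described in the snippet above, is to report the largest element-wise difference and where it occurs. The sketch below is illustrative only: `max_abs_diff` is a hypothetical helper, and it assumes each dump has already been loaded as a flat list of floats in the same order.

```python
def max_abs_diff(a, b):
    """Return (largest absolute difference, index where it occurs).

    `a` and `b` are assumed to be flat lists of logits from two runs,
    aligned so that a[i] and b[i] refer to the same position and token.
    """
    if len(a) != len(b):
        raise ValueError("logit dumps differ in length")
    if not a:
        return 0.0, None

    diffs = [abs(x - y) for x, y in zip(a, b)]
    worst = max(range(len(diffs)), key=diffs.__getitem__)
    return diffs[worst], worst
```

Small differences are expected from floating-point reordering, so what counts as significant depends on the precision in use.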

Requires installing flash attention 2.0 from https://github.com/Dao-AILab/flash-attention if flash2 = True. Gives a small speedup on my 3080 (which is not the ideal GPU to run this on, would be...