Hermes Trismegistus

15 comments by Hermes Trismegistus

It is not an easy bug to reproduce or to create a minimal example for, as it doesn't seem to occur in every case and behaves differently depending on the backend...

I have isolated the bug further by eliminating the caching. The caching is apparently not the cause of the bug. Adding a single `clone()` to `q` at line 658 of...

Running `cargo update` fixed that specific example, but the data is still corrupted in other places. This is one of those tricky bugs that seems to pop up in random...

I gave it one more go and found that the main line causing the issue is `let wv = qkv_attention(q, k2, v2, None, self.n_head);`. Simply using `let q = Tensor::from_data(self.query.forward(x).into_data());` rather...

Feel free to use my FFT implementation, but be aware that it isn’t as general as the one torch uses. There are some padding options and checks that should be...
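
To illustrate the kind of padding and length checks the last comment alludes to, here is a minimal sketch of a radix-2 FFT using only the Rust standard library. It is not the implementation offered in the comment, and it does not mirror torch's API: the `Complex` type and the `pad_to_pow2` and `fft` functions are hypothetical names chosen for this example, and zero-padding to the next power of two is an assumed padding strategy.

```rust
use std::f64::consts::PI;

/// Minimal complex number type so the sketch needs no external crates.
/// (Hypothetical; a real project would likely use an existing complex type.)
#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

impl Complex {
    fn new(re: f64, im: f64) -> Self {
        Complex { re, im }
    }
    fn add(self, o: Complex) -> Complex {
        Complex::new(self.re + o.re, self.im + o.im)
    }
    fn sub(self, o: Complex) -> Complex {
        Complex::new(self.re - o.re, self.im - o.im)
    }
    fn mul(self, o: Complex) -> Complex {
        Complex::new(
            self.re * o.re - self.im * o.im,
            self.re * o.im + self.im * o.re,
        )
    }
}

/// Assumed padding strategy: zero-pad a real signal to the next power of two.
fn pad_to_pow2(signal: &[f64]) -> Vec<Complex> {
    let n = signal.len().max(1).next_power_of_two();
    let mut out: Vec<Complex> = signal.iter().map(|&x| Complex::new(x, 0.0)).collect();
    out.resize(n, Complex::new(0.0, 0.0));
    out
}

/// Recursive radix-2 Cooley-Tukey FFT.
/// Returns `None` when the length is zero or not a power of two,
/// which is the kind of input check a less general FFT needs.
fn fft(input: &[Complex]) -> Option<Vec<Complex>> {
    let n = input.len();
    if n == 0 || !n.is_power_of_two() {
        return None;
    }
    if n == 1 {
        return Some(vec![input[0]]);
    }
    // Split into even- and odd-indexed samples and transform each half.
    let even: Vec<Complex> = input.iter().step_by(2).copied().collect();
    let odd: Vec<Complex> = input.iter().skip(1).step_by(2).copied().collect();
    let e = fft(&even)?;
    let o = fft(&odd)?;

    // Combine the halves with the twiddle factors exp(-2*pi*i*k/n).
    let mut out = vec![Complex::new(0.0, 0.0); n];
    for k in 0..n / 2 {
        let angle = -2.0 * PI * (k as f64) / (n as f64);
        let t = Complex::new(angle.cos(), angle.sin()).mul(o[k]);
        out[k] = e[k].add(t);
        out[k + n / 2] = e[k].sub(t);
    }
    Some(out)
}
```

Calling `fft(&pad_to_pow2(&samples))` transforms a signal of arbitrary length, at the cost of changing the frequency resolution through the added zeros. Passing an unpadded slice whose length is not a power of two yields `None` instead of a silently wrong result.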