Richard Janis Goldschmidt

Results: 54 comments by Richard Janis Goldschmidt

The way forward to an efficient integer gemm is to change the packing of the matrices that the kernel accesses. Google's gemmlowp does exactly that. An example can...

I was trying the `i32` kernels and noticed that when optimizing for my CPU, which supports AVX2, I see quite significant performance losses compared with no optimization. This...

Assuming you are fine with `avx2`, we have the following intrinsics available:

```
_mm256_add_epi8
_mm256_add_epi16
_mm256_add_epi32
_mm256_add_epi64
_mm256_adds_epi8
_mm256_adds_epi16
_mm256_adds_epu8
_mm256_adds_epu16
```

`add` is wrapping, `adds` is saturating. I personally...
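To make the wrapping versus saturating distinction concrete, here is a minimal scalar sketch of what a single 8-bit lane would compute. It does not call the AVX2 intrinsics themselves. The helper names (`add_epi8_lane`, `adds_epi8_lane`, `adds_epu8_lane`) are hypothetical and only mirror the intrinsic names for readability. They rely on Rust's standard `wrapping_add` and `saturating_add` integer methods.

```rust
/// Hypothetical scalar model of one lane of `_mm256_add_epi8`:
/// the sum wraps around on overflow (two's complement).
fn add_epi8_lane(a: i8, b: i8) -> i8 {
    a.wrapping_add(b)
}

/// Hypothetical scalar model of one lane of `_mm256_adds_epi8`:
/// the sum is clamped to the signed 8-bit range [-128, 127].
fn adds_epi8_lane(a: i8, b: i8) -> i8 {
    a.saturating_add(b)
}

/// Hypothetical scalar model of one lane of `_mm256_adds_epu8`:
/// the sum is clamped to the unsigned 8-bit range [0, 255].
fn adds_epu8_lane(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}

fn main() {
    // 100 + 100 = 200 does not fit into an i8 (maximum 127).
    println!("wrapping:            {}", add_epi8_lane(100, 100)); // 200 - 256 = -56
    println!("saturating (signed): {}", adds_epi8_lane(100, 100)); // clamped to 127
    println!("saturating (signed): {}", adds_epi8_lane(-100, -100)); // clamped to -128
    println!("saturating (unsigned): {}", adds_epu8_lane(200, 100)); // clamped to 255
}
```

The vector intrinsics apply the same per-lane rule to all lanes of a 256-bit register at once. Which behaviour is preferable for an integer gemm depends on whether silent overflow is acceptable for the accumulated products.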

@mratsim, have you stumbled across Google's [`gemmlowp`](https://github.com/google/gemmlowp)? From what I understood, the main ingredient of their avx2-based integer gemm is a more sophisticated packing routine. Instead of...

Hi @ndattani. This effort stalled because I realized that I didn't actually need integer matrices for my implementation of Al-Mohy and Higham's `expm`. Also, reading some other integer gemm...

I have filed a bug against rustc here: https://github.com/rust-lang/rust/issues/93296. It looks like I made a mistake in my GitHub CI with respect to 1.57 (`rustup install 1.57 && rustup default stable`...

There were 2 problems with the above: 1) no linking to cublas was done; adding `-dlink` fixes that. 2) after doing that, a lot of `conflicting declaration of C function`...

@seanmonstar Apologies for the ping, but I was hoping we could merge this very innocuous change?

I was also caught by this recently when switching from the tracing subscriber pretty-printer to the JSON formatter. I am investigating alternatives to `fmt::Subscriber`, but it would be nice if it picked up...

IIUC this already exists in the form of the valuable-serde bridge and was implemented in #1862. Of course valuable is still unstable. :( One alternative is to create something like the...