Benoit Jacob

48 issues by Benoit Jacob

I'm using Tracy with sampling on Android/aarch64. Some of my symbols, which are generated by a custom LLVM-based compiler (IREE), are not visible to TracySourceView at first: trying to open...

Not a real pull request. Just to illustrate issue #384.

I have run into the assertion being removed here. Apparently it assumed that a `float` value would print as a specific number of characters, and that assumption was defeated...

I'm running the Tracy UI on Linux, but remotely: my local machine is a Mac and I'm using a remote desktop solution to use my Linux machine. At least in...

enhancement
help wanted
wontfix

to use caching of weights and use the same ordering of the matmul as in other xnnpack benchmarks. I can't test this easily for now but once the Ruy CMakeLists.txt...

cla: yes

This is currently blocked by issue #9903.

# Preamble

ARM NEON has fixed-point multiplication instructions, like `sqdmulh`. Existing NN inference solutions (TFLite, ruy, XNNPACK) use them. It's not possible to match their performance in quantized workloads without...

performance ⚡
codegen/llvm
integrations/tosa
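
For readers unfamiliar with `sqdmulh` ("signed saturating doubling multiply returning high half"), the following is a minimal scalar sketch of what one 16-bit lane computes. It is an illustration only: the function name `sqdmulh_s16` is hypothetical, it models a single lane in plain Python rather than a real NEON instruction, and it is not taken from the issue or from any of the libraries mentioned.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def sqdmulh_s16(a: int, b: int) -> int:
    """Scalar model of one 16-bit lane of SQDMULH (hypothetical helper).

    Doubles the full-width product, keeps the high 16 bits, and
    saturates the result to the int16 range.
    """
    assert INT16_MIN <= a <= INT16_MAX
    assert INT16_MIN <= b <= INT16_MAX
    doubled = 2 * a * b          # exact product, doubled (Python ints do not overflow)
    high = doubled >> 16         # arithmetic shift: keep the high half, no rounding
    return max(INT16_MIN, min(INT16_MAX, high))  # saturate


# Interpreting lanes as Q15 fixed point: 16384 represents 0.5,
# so 0.5 * 0.5 yields 8192, which represents 0.25.
quarter = sqdmulh_s16(16384, 16384)

# The only input pair that saturates: (-1.0) * (-1.0) would be +1.0,
# which is not representable in Q15.
saturated = sqdmulh_s16(INT16_MIN, INT16_MIN)
```

Under the Q15 reading, the doubling is what makes the high half of the product line up with the fixed-point result, which is why a single instruction can replace a widen, multiply, shift, and narrow sequence.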

ARM NEON has pairwise-folding addition instructions, where pairs of narrow (e.g. 8-bit) input lanes are added together and accumulated into wider (e.g. 16-bit) integer lanes. Examples are `SADALP` and `SADDLP`. This...
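
The pairwise behaviour described above can be sketched with a scalar model. This is a hedged illustration for the 8-bit to 16-bit case only: `saddlp_s8`, `sadalp_s8`, and `wrap_s16` are hypothetical helper names, and the code operates on Python lists rather than real vector registers.

```python
def wrap_s16(x: int) -> int:
    """Wrap an integer to the signed 16-bit range (two's complement)."""
    return ((x + 0x8000) & 0xFFFF) - 0x8000


def saddlp_s8(v: list[int]) -> list[int]:
    """Model of SADDLP on int8 lanes: add adjacent pairs into int16 lanes.

    The sum of two int8 values always fits in int16, so no wrapping is needed.
    """
    assert len(v) % 2 == 0
    return [v[i] + v[i + 1] for i in range(0, len(v), 2)]


def sadalp_s8(acc: list[int], v: list[int]) -> list[int]:
    """Model of SADALP: pairwise-add int8 lanes and accumulate into int16 lanes.

    The accumulation is a plain (wrapping) add, not a saturating one.
    """
    pairs = saddlp_s8(v)
    assert len(acc) == len(pairs)
    return [wrap_s16(a + p) for a, p in zip(acc, pairs)]


narrow = [1, 2, 3, 4, -128, -128, 127, 127]   # eight int8 lanes
pair_sums = saddlp_s8(narrow)                 # four int16 lanes
accumulated = sadalp_s8([10, 20, 30, 40], narrow)
```

The accumulating form is the one that matters for dot-product style kernels: it halves the lane count while widening, so several 8-bit products or inputs can be folded into a 16-bit accumulator in one step.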

This is open-ended. The problem is that many key use cases, such as matrix multiplication kernels, need to know how many SIMD vector registers they can count on...

post SIMD MVP

Suppose you have two vectors u and v, and you want to multiply all elements of the vector u by a single lane of the vector v, e.g. v[0]. This...
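
As a small illustration of the operation described above, here is a scalar sketch that multiplies every element of `u` by one lane of `v`, next to the equivalent two-step form that first broadcasts the lane and then multiplies elementwise. The function names `mul_by_lane` and `mul_via_broadcast` are hypothetical and only model the arithmetic; they do not correspond to any specific instruction or API named in the issue.

```python
def mul_by_lane(u: list[float], v: list[float], lane: int) -> list[float]:
    """Multiply all elements of u by the single element v[lane]."""
    scalar = v[lane]
    return [x * scalar for x in u]


def mul_via_broadcast(u: list[float], v: list[float], lane: int) -> list[float]:
    """Same result in two steps: broadcast v[lane], then multiply elementwise."""
    broadcast = [v[lane]] * len(u)             # step 1: duplicate the lane
    return [x * y for x, y in zip(u, broadcast)]  # step 2: lane-wise multiply


u = [1.0, 2.0, 3.0, 4.0]
v = [10.0, 20.0, 30.0, 40.0]
scaled = mul_by_lane(u, v, 0)
```

The two functions compute the same values; the difference on real hardware is whether the broadcast costs a separate instruction and a temporary register, or is folded into the multiply.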