pca006132
May I ask whether there is a way to specify a different `memcpy` implementation? I'm on an ARM platform, and the `memcpy` is not very efficient for my use case as I...
> @pca006132 I think similar to #365, we could also add optimized variants of `memcpy` and friends for `aarch64`. I'm not sure what the best implementation would be for that...
IIRC LLVM already emits small copy operations as a sequence of loads/stores instead of calling `memcpy`. I'm not sure what the threshold is, though.
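To make the small-copy point concrete, here is a minimal sketch in Rust. The function names `copy_small` and `copy_dynamic` are hypothetical and only for illustration. The assumption is that a copy whose length is a compile-time constant can usually be inlined by an optimizing build, while a runtime-sized copy usually goes through `memcpy`; neither outcome is guaranteed, and the exact threshold depends on the target and the LLVM version.

```rust
/// Copies a fixed 16-byte block. The length is known at compile time, so an
/// optimizing build is free to lower this to a few load/store instructions
/// instead of a `memcpy` call (typical behaviour, not a guarantee).
pub fn copy_small(dst: &mut [u8; 16], src: &[u8; 16]) {
    dst.copy_from_slice(src);
}

/// Copies a runtime-sized prefix. The length is only known at run time, so
/// the compiler will usually emit a call to `memcpy` here.
/// Returns the number of bytes copied.
pub fn copy_dynamic(dst: &mut [u8], src: &[u8]) -> usize {
    let n = dst.len().min(src.len());
    dst[..n].copy_from_slice(&src[..n]);
    n
}
```

Comparing the generated assembly for the two functions in a release build is one way to check which path a given target actually takes.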
I think using the glibc implementations is probably a better choice here; they are well optimized and already use techniques such as vectorization.
I've tried adding basic Scala Metals support:
```lua
local function metals_status_handler(_, status, ctx)
  -- https://github.com/scalameta/nvim-metals/blob/main/lua/metals/status.lua#L36-L50
  local val = {}
  if status.hide then
    val = {kind = "end"}
  elseif status.show then...
```
Also, if these execution policies are not deprecated/discouraged to use, is it legitimate to mix them? I mixed `thrust::cpp::par` (which should be sequential I think?), `thrust::tbb::par`/`thrust::omp::par` and `thrust::cuda::par` [in another...
It seems that this bug also affects the OMP backend, which can be verified with AddressSanitizer.
I think the command changed to `languagetool-commandline`?
> `BigInt::zero()` doesn't require any allocation, so if that's your main concern, I think we should be fine. The goal here was to avoid duplicating the sign logic, which `BigUint`...
> What is the scale of the numbers being multiplied? If allocation is a significant portion of the calculation, I'm guessing they are relatively small values to begin with. Yes,...