Sebastian Berg
Many of these routines used int32 for some shapes internally. Usually, that works out just fine, since all of these shapes represent only a part of the array, it is...
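To illustrate the general concern, here is a minimal sketch of how a shape-derived quantity can stop fitting in a signed 32-bit integer. The helper names `fits_int32` and `wrap_int32` are hypothetical and not part of any library; the routines and shapes the PR actually touches are not shown here, so this only demonstrates the arithmetic.

```python
INT32_MAX = 2**31 - 1


def fits_int32(shape):
    """Return True if every extent and the total element count fit in int32."""
    total = 1
    for extent in shape:
        total *= extent
    return all(0 <= extent <= INT32_MAX for extent in shape) and total <= INT32_MAX


def wrap_int32(value):
    """Mimic two's-complement wraparound of a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


# Each extent is small, but the element count of the second shape is 2**32.
small = (1024, 1024)
large = (65536, 65536)

print(fits_int32(small))   # element count 2**20 fits
print(fits_int32(large))   # element count 2**32 does not fit
print(wrap_int32(65536 * 65536))  # what a 32-bit accumulator would hold
```

Python integers are arbitrary precision, so the wraparound is simulated explicitly; in C or CUDA code the same product would overflow silently.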
**This is a PoC just for awareness, it needs cleaning up and a lot of things are still broken (including a few things that I broke unnecessarily).** Overall, this is...
This re-organizes things very slightly to avoid unnecessary attribute checking and errors. In the first case, it just avoids an actual error; in the second, it checks for cupy arrays, because otherwise...
This tries to simplify the scalar handling. In part just for maintenance and a small speed boost, but largely to make it easier to support arbitrary dtypes in the scalar...
This reduces the GUfunc overhead of a simple matmul from ~60% of the operation to probably 10% or a bit more. I expect around 1/3 of the time if the...
For simple `cupy.matmul` calls, half of the overhead is still inside the gufuncs even with the previous PR, which suggests adding a fast-path mechanism. I would love to avoid the...
**I'll keep this as draft for now, it should work OK but I want to re-think and maybe even roll back a few changes.** This restructures the single device memory...
I think NumPy 2 has been out long enough to start allowing this, at least in some code paths (e.g. constructing structured dtypes where the fields add up to be...
**EDIT: Fully changed to match new state** This vendors [`spatch`](https://scientific-python.github.io/spatch/index.html) to do type dispatching for `SCIPY_ARRAY_API=1`, right now limited to `scipy.signal`. It in-lines dispatch logic, which I think is much...
Moving over from gh-175, I think stream handling needs to be discussed separately. I had thought quite a bit about it a while ago. My take-aways (very summarized) were that...