loopy

A code generator for array-based code on CPUs and GPUs

Results 154 loopy issues

Maybe the newly introduced accumulator variables should be tagged with some specification, or their naming should be standardized. In feinsum, I have to rely on internals of `realize_reduction` (cf. https://github.com/kaushikcfd/feinsum/blob/87e3a43dff6ebbc3709a3afc9df174ab2e12c8fa/src/feinsum/tuning/impls/ifj_fe_fej_to_ei.py#L447-L448)...

There's no way to express such match criteria, maybe we need one such `MatchExpressionBase`?

The function signatures for the transforms aren't finalized yet. I'm happy to make changes to these with help from the reviewers. Draft because: - [x] Pass CI - [x] Change...

Used in e.g. `split_reduction_outward`, #711. cc @kaushikcfd

TODOs: - [x] add to MemAccess, Sync _Please squash_

This is currently a proof-of-concept. **Edit:** I removed the previous performance results; they were likely caused by some kind of caching of kernels. TODOs: - [x] ~~add `mutate` support to constantdict...

What do you think @inducer? This would not only make debugging cache misses easier, it could also be used to automate determinism tests (by setting `LOOPY_ABORT_ON_CACHE_MISS` to something trueish, and...
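The idea in the excerpt above could be sketched roughly as follows. This is only an illustration and not loopy's actual caching code: `SketchCache`, `CacheMissError`, and `_env_flag` are hypothetical names, the cache is a plain in-memory dict, and the set of values treated as false is an assumption. Only the environment variable name `LOOPY_ABORT_ON_CACHE_MISS` comes from the excerpt.

```python
import os


class CacheMissError(RuntimeError):
    """Hypothetical error raised when a lookup misses and aborting is enabled."""


def _env_flag(name):
    # Assumption: unset, empty, "0", "false", "no" and "off" count as false;
    # anything else counts as true.
    value = os.environ.get(name, "")
    return value.strip().lower() not in ("", "0", "false", "no", "off")


class SketchCache:
    """Hypothetical in-memory cache standing in for a kernel cache."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, key, compute):
        try:
            return self._store[key]
        except KeyError:
            # The flag is read at lookup time so a test can toggle it
            # between a warm-up run and a checked run.
            if _env_flag("LOOPY_ABORT_ON_CACHE_MISS"):
                raise CacheMissError(f"cache miss for key {key!r}") from None
            value = self._store[key] = compute()
            return value
```

Under these assumptions, a determinism test would populate the cache in a first run, set the variable, and repeat the same work: any `CacheMissError` in the second run means a cache key changed between two supposedly identical runs.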

``` knl = lp.make_kernel( " { [i] : 0

- [x] This will fail until https://github.com/inducer/pymbolic/pull/151 is in. - [ ] Maybe this will wait until pymbolic 2024.1 is released, so that the Firedrake test can pass?