loopy

A code generator for array-based code on CPUs and GPUs

Results 154 loopy issues

```python3
import loopy as lp
import vmprof

def make_kernel():
    n_insn = 2000
    # 'ndomains' must be greater than 'n' as it also includes the iname domains
    # of sub-arrays refs....
```

(PR currently just for comparison. This will eventually be merged into [new-dependency-and-nest-constraint-semantics-development](https://github.com/inducer/loopy/tree/new-dependency-and-nest-constraint-semantics-development) after `scheck-new-dependencies-against-sio` is merged.) Given a loopy kernel with legacy dependencies, create dependencies of the new formalized kind....

Thanks to @kaushikcfd, scheduling is now super fast compared to other parts of loopy. Still, `make_kernel, preprocess_kernel, codegen` take so much time that some sumpy kernels are unusable. Here's a...

Consider the [`face_mass` kernel](https://github.com/inducer/grudge/blob/c7cd63da6ea1dfc1b992b0bb15760ca98a9a08f4/grudge/execution.py#L339). The `f` and `j` dimensions could be joined, shrinking the array dimensions from 3D to 2D. Currently there is a [`split_array_axis`](https://github.com/inducer/loopy/blob/186f5095a54982b7eb2fda5e4b995d7c047fde1e/loopy/transform/padding.py#L369) transformation but no `join_array_axes` transformation...
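The index arithmetic behind joining two adjacent axes can be sketched in plain Python. This is only an illustration of the idea, not loopy's API: the helper names (`joined_index`, `split_index`), the `[i, f, j]` layout, and the sizes are all assumptions made for the example, and row-major ordering of the joined axis is assumed.

```python
# Hypothetical sizes, chosen only for illustration.
n_i, n_f, n_j = 4, 3, 5


def joined_index(f, j, n_j):
    """Map the index pair (f, j) to a single index on the joined axis."""
    return f * n_j + j


def split_index(fj, n_j):
    """Inverse of joined_index: recover (f, j) from the joined index."""
    return divmod(fj, n_j)


# A 3D array stored as nested lists, indexed as a3[i][f][j].
a3 = [[[100 * i + 10 * f + j for j in range(n_j)]
       for f in range(n_f)]
      for i in range(n_i)]

# The same data as a 2D array, indexed as a2[i][f*n_j + j].
a2 = [[a3[i][f][j] for f in range(n_f) for j in range(n_j)]
      for i in range(n_i)]

# Every 3D access has an equivalent 2D access.
all_match = all(
    a3[i][f][j] == a2[i][joined_index(f, j, n_j)]
    for i in range(n_i) for f in range(n_f) for j in range(n_j))
```

A real `join_array_axes` transformation would also have to rewrite every subscript of the array inside the kernel, which is the inverse of what `split_array_axis` does today.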

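Minimal reproducer:
```python
# Setup assumed by the original snippet: imports and a command queue `cq`.
import numpy as np
import pyopencl as cl
import loopy as lp

cq = cl.CommandQueue(cl.create_some_context())

knl = lp.make_kernel(
    "{:}",
    """
    y = 2*x
    """)
evt, (out,) = knl(cq, x=np.asarray(3.0, dtype=float))
print(repr(out))  # prints: cl.Array(6.)
```
However if 'y' was anything other than...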

x-ref: https://github.com/inducer/meshmode/pull/143

Suggested here: https://github.com/inducer/loopy/pull/258#discussion_r602614785

Any tags that are not used in code generation should likely be stripped before entering into code generation. Otherwise, we'll get cache misses (and run code generation redundantly) for no...
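The caching effect described here can be sketched with a toy cache in plain Python. Nothing below is loopy's API: the tag names, `CODEGEN_RELEVANT_TAGS`, and both functions are hypothetical, and the "code generator" just formats a string.

```python
from functools import lru_cache

# Hypothetical set of tags that influence generated code.
CODEGEN_RELEVANT_TAGS = frozenset({"unroll", "vectorize"})

# Records every real (non-cached) code generation run.
codegen_calls = []


def strip_irrelevant_tags(tags):
    """Keep only the tags that can change the generated code."""
    return frozenset(t for t in tags if t in CODEGEN_RELEVANT_TAGS)


@lru_cache(maxsize=None)
def _generate_code_cached(body, tags):
    codegen_calls.append((body, tags))
    return "/* tags: %s */ %s" % (", ".join(sorted(tags)), body)


def generate_code(body, tags):
    """Strip irrelevant tags first so they cannot affect the cache key."""
    return _generate_code_cached(body, strip_irrelevant_tags(tags))


# These two kernels differ only in tags that code generation ignores,
# so the second call is served from the cache.
first = generate_code("y = 2*x", {"unroll", "plot_label"})
second = generate_code("y = 2*x", {"unroll", "debug_name"})
```

Without the stripping step, the two calls would produce different cache keys and code generation would run twice for identical output.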

See https://github.com/inducer/loopy/pull/245#discussion_r592905839

In no-numpy mode, the way one finds out that one can't pass a numpy array is by a failing `.offset` attribute access. We should (at least optionally) check argument types....
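An explicit check could look roughly like the following sketch. It is written under assumptions: `FakeDeviceArray` is a stand-in for whatever array type the invoker accepts, and `check_kernel_args` is a hypothetical helper, not an existing loopy function.

```python
class FakeDeviceArray:
    """Stand-in for a device array type that exposes `.offset`."""

    def __init__(self, offset=0):
        self.offset = offset


def check_kernel_args(kwargs, allowed_types):
    """Raise a descriptive TypeError for arguments of an unexpected type."""
    for name, value in kwargs.items():
        if not isinstance(value, allowed_types):
            expected = ", ".join(t.__name__ for t in allowed_types)
            raise TypeError(
                "argument '%s' has type '%s', expected one of: %s"
                % (name, type(value).__name__, expected))


# Accepted: the argument has one of the allowed types.
check_kernel_args({"x": FakeDeviceArray()}, (FakeDeviceArray,))

# Rejected early, with a message naming the offending argument,
# instead of a later AttributeError on `.offset`.
try:
    check_kernel_args({"x": [3.0]}, (FakeDeviceArray,))
except TypeError as exc:
    error_message = str(exc)
```

Making the check optional, as the issue suggests, would keep the per-call overhead out of hot paths.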