loopy

A code generator for array-based code on CPUs and GPUs

Results 138 loopy issues

Even with #755, attempting to prefetch many arrays scales poorly. By the 19th `add_prefetch` operation, a single `add_prefetch` call takes around 5 seconds to complete on one of the fused Mirgecom kernels with...

I'm not sure why it becomes unschedulable.

```python
import loopy as lp
import numpy as np
from pymbolic.primitives import *
import immutables

e2p_from_single_box_knl = lp.make_kernel(
    [
        "[ntgt_boxes] -> { [itgt_box]...
```

```
knl = lp.make_kernel(
    [
        "{[i,j]: 0
```

For example, the following code fails with `AttributeError: 'SeparateArrayArrayDimTag' object has no attribute 'stride'`:

```python
import loopy as lp
import numpy as np

child_knl = lp.make_function(
    [],
    """
    g[0] = 2*e[0]...
```

For example:

```python
knl = lp.make_kernel(
    [
        "{ [i]: 0
```

Context: https://github.com/inducer/loopy/pull/698#issuecomment-1306451565 IMO, this should only apply to code generation, not transforms. Transforms can receive permission to ignore FP reordering individually. Another aspect is that reductions do not even *have*...
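As background for the FP-reordering point above, here is a minimal sketch in plain Python (not loopy code) of why reordering floating-point additions matters: addition is not associative in IEEE 754 arithmetic, so regrouping the same operands can change the result. The variable names are illustrative only and do not come from the linked discussion.

```python
import math

# Three operands summed with two different groupings.
values = [0.1, 0.2, 0.3]

left_grouped = (values[0] + values[1]) + values[2]
right_grouped = values[0] + (values[1] + values[2])

# The two groupings are mathematically identical but differ in the last bits.
reordering_changes_result = left_grouped != right_grouped

# They still agree to within a tight relative tolerance.
close_enough = math.isclose(left_grouped, right_grouped, rel_tol=1e-12)

print(left_grouped, right_grouped, reordering_changes_result, close_enough)
```

This is why a transform that regroups a sum may need explicit permission to do so, even when the difference is only a rounding error.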

Analogous to https://github.com/inducer/pyopencl/issues/668. We probably want to restrict this checking to `__debug__` mode.
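A minimal sketch of what restricting a check to `__debug__` mode could look like, assuming the check is a plain Python validation step. `check_args` and `invoke` are hypothetical names for illustration and are not loopy or pyopencl functions.

```python
def check_args(args):
    """Hypothetical validation step that may be costly on hot paths."""
    for name, value in args.items():
        if not isinstance(name, str):
            raise TypeError(f"argument name must be a str, got {type(name).__name__}")
        if not isinstance(value, (int, float)):
            raise TypeError(f"argument '{name}' must be numeric")


def invoke(args):
    """Hypothetical entry point: validate only when __debug__ is true."""
    # __debug__ is True by default and False when Python runs with -O,
    # so the validation is skipped entirely in optimized mode.
    if __debug__:
        check_args(args)
    return sum(args.values())


result = invoke({"a": 1, "b": 2})
print(result)
```

Running the interpreter with `python -O` sets `__debug__` to `False`, which removes the cost of the check without changing the calling code.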

Failure on pocl-cuda with `n=16` can be reproduced locally. With intel-cpu it is not reproduced locally, and is intermittent on CI. See https://github.com/inducer/loopy/actions/runs/3787208816/jobs/6438795872 Oclgrind, NVIDIA, pocl-pthread all work. Wonder if...