Kaushik Kulkarni
This is on hold currently as @inducer and I agreed in a personal chat that: 1. The soundness checking is hokey and must wait for proper dependencies in loopy. 2....
We could hack type inference so that it doesn't traverse instructions with circular dependencies, like `insn2`. Here's the patch for it, which infers the types correctly. (I guess it would be...
I hadn't looked at https://github.com/inducer/loopy/commit/ed5d1458abb07f7d30de4854b1e4f427480e52df; that LGTM. I tried it on small kernels with reduction-like instruction pairs, and it yielded the correct results (thanks to the second round).
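For reference, here is a toy sketch of why a second round helps with reduction-like instruction pairs. It is not loopy's implementation: `infer_types`, `promote`, the `(assignee, operands)` instruction tuples, and the simplified promotion order are all hypothetical stand-ins. An instruction with no typed operand yet is skipped and retried in the next round, and a self-referencing update such as `acc = acc + x` takes its type from the operands that are already known.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction model: (assignee, names of the variables it reads).
Insn = Tuple[str, List[str]]

# Toy promotion order (an assumption; real dtype promotion is more involved).
_RANK = ["int32", "int64", "float32", "float64"]


def promote(dtypes: List[str]) -> str:
    """Return the highest-ranked dtype in the toy ordering."""
    return max(dtypes, key=_RANK.index)


def infer_types(insns: List[Insn], known: Dict[str, str]) -> Tuple[Dict[str, str], int]:
    """Sweep over the instructions until no inferred dtype changes."""
    types = dict(known)
    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        for assignee, operands in insns:
            # Untyped operands (e.g. the circular read of `acc`) are ignored for now.
            available = [types[v] for v in operands if v in types]
            if not available:
                continue  # nothing known yet; retry in the next round
            if assignee in types:
                # Only ever widen an existing dtype, so the loop terminates.
                available.append(types[assignee])
            new = promote(available)
            if types.get(assignee) != new:
                types[assignee] = new
                changed = True
    return types, rounds


# `out` reads `acc` before `acc` has been typed, so it is resolved in a later round.
insns = [("out", ["acc"]), ("acc", ["acc", "x"])]
types, rounds = infer_types(insns, {"x": "float64"})
print(types, rounds)
```

In this toy model the first round types `acc` from `x`, and `out` only gets its type in the following round.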
Hi Sophia, Thanks for the PR. Could you please point out what additional features `c_vec` provides that aren't already present in the [`vec` iname tags](https://godbolt.org/z/3xnMdv)? IIUC the advantage of...
An even simpler implementation is possible by handling the vector extensions similar to `OpenCLTarget`'s support for vector types (like `floatn, doublen`). I have demonstrated it in the [c_vecextensions_target](https://github.com/inducer/loopy/compare/c_vecextensions_target) branch. I...
Thanks for taking a look into the alternative implementation! > I think going over the typedef would be a bit neater, or is there any disadvantage to that? True. That...
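To make the typedef route concrete, here is a minimal sketch of emitting a GCC/Clang vector-extension typedef whose name mirrors OpenCL's `doublen` naming. The helper `vector_typedef` and the generated type names are hypothetical and are not taken from the branch; the only thing assumed about the compiler is the `__attribute__((vector_size(N)))` syntax, with `N` in bytes.

```python
def vector_typedef(base: str, itemsize: int, n: int) -> str:
    """Return a C typedef for an n-wide vector of `base` (itemsize in bytes).

    Hypothetical helper: the vector length is restricted to powers of two
    here, which is the case the compilers' vector extensions handle.
    """
    if n <= 0 or n & (n - 1):
        raise ValueError("vector length must be a positive power of two")
    nbytes = itemsize * n  # vector_size takes the total size in bytes
    return f"typedef {base} {base}{n} __attribute__((vector_size({nbytes})));"


print(vector_typedef("double", 8, 4))
# typedef double double4 __attribute__((vector_size(32)));
```

With the typedef in the preamble, generated code can refer to `double4` much like the OpenCL target refers to its built-in vector types.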
> (about the implementation in PyOP2) In PyOP2 the steps that would be needed (not very different from what's already implemented): - Split `n` into `n_batch` depending on the underlying...
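As an illustration of the first step only, the split of `n` into batches can be sketched in plain Python. This is not PyOP2 or loopy code: `split_index`, `i_batch`, and `i_lane` are hypothetical names, and the vector width is passed in as an assumed parameter.

```python
from typing import Iterator, Tuple


def split_index(n: int, width: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (i_batch, i_lane, i) for range(n) split into batches of `width`."""
    n_batch = -(-n // width)  # ceiling division: number of batches
    for i_batch in range(n_batch):
        for i_lane in range(width):  # the inner loop is the one to vectorize
            i = i_batch * width + i_lane
            if i < n:  # guard for a partially filled last batch
                yield i_batch, i_lane, i


for i_batch, i_lane, i in split_index(10, 4):
    print(i_batch, i_lane, i)
```

Every original index `i` is visited exactly once, and the inner `i_lane` loop has a fixed length, which is what makes it a candidate for vectorization.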
> Does your proposal imply that we would drop the fallback option of OpenMPSIMD pragmas completely? I agree `omp_simd` tags are nice to have as a fallback. New proposal: have...
This is ready for a look. For a better reviewing experience, please go through the patch commit by commit.
> It looks like the iname for iel_batch was dropped This looks like a bug in loopy's vectorization implementation for instructions that contain reductions. > (and potentially other supported math functions)...