Kaushik Kulkarni

55 issues by Kaushik Kulkarni

Here's an MWE:
```python
>>> import pycuda.autoinit
>>> import pycuda.gpuarray as cu_np
>>> a = cu_np.zeros(10, dtype="int32") + 1
>>> b = cu_np.zeros(10, dtype="int32") + 2
>>> a / b...
```

bug

Consider the simple batched matvec example:
```python
knl = lp.make_kernel(
    "{[e,i,j]: 0
```

enhancement

`CFamilyTarget` should only define the callables present in the intersection of all the C-based targets. The `complex.h` math functions for complex-typed operands aren't common to targets like OpenCL/CUDA...
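The "intersection of all the C-based targets" idea can be sketched with plain sets. This is only an illustration, not loopy's implementation: the three tables below are hypothetical, hand-written, and far from exhaustive, and `common_callables` is a made-up helper name.

```python
# Hypothetical, non-exhaustive tables of math callables per C-based target.
# These are illustrative assumptions, not data taken from loopy.
C99_CALLABLES = frozenset({"sin", "cos", "sqrt", "fabs", "cabs", "csqrt", "cexp"})
OPENCL_CALLABLES = frozenset({"sin", "cos", "sqrt", "fabs"})
CUDA_CALLABLES = frozenset({"sin", "cos", "sqrt", "fabs"})


def common_callables(*per_target):
    """Return the names that appear in every target's table."""
    if not per_target:
        return frozenset()
    return frozenset.intersection(*map(frozenset, per_target))


# What a shared base target could safely define under these assumptions.
C_FAMILY_CALLABLES = common_callables(
    C99_CALLABLES, OPENCL_CALLABLES, CUDA_CALLABLES
)
```

Under these assumed tables, the `complex.h` names (`cabs`, `csqrt`, `cexp`) fall out of the shared set and would have to be registered by the plain C target alone.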

/cc @sv2518 Adds support for GNU vector extensions.
TODO:
- [x] Implement `OMPSimdInameTag`.
- [x] In `loopy.codegen.expression` infer the fallback mechanisms from the target.
- [x] Pass CI.
- [...

The following kernel --
```
knl = lp.make_kernel(
    "{[i, j]: 0
```

The implementation here is based on the paper "[Memory optimization by counting points in integer transformations of parametric polytopes](https://dl.acm.org/doi/abs/10.1145/1176760.1176771)".

Draft because:
- [x] Incomplete Implementation
- [x] Needs `pw_qpolynomial_to_expr`
-...

TODO:
- [x] Do index analysis to verify the validity of the iname-duplication passes.
- [x] Add complicated regressions.

Draft because:
- [ ] includes commits from #350.
- [...

help wanted

Implementation for finding loop nest around map in O(N·k), where N is the number of inames and k is the maximum loop depth. For comparison, let's consider the kernel in #288:...
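To make the O(N·k) bound concrete, here is a minimal sketch. It is not the implementation from this PR. It assumes the loop nesting is already known as a forest, given as a hypothetical `parent` dictionary that maps each iname to its immediately enclosing iname (or `None` for an outermost loop). The function name and the example inames are likewise made up for illustration.

```python
def loop_nest_around_map(parent):
    """Map each iname to the frozenset of inames nested around it.

    `parent` is an assumed input: iname -> immediately enclosing iname,
    or None if the iname is outermost.
    """
    nest_around = {}
    for iname in parent:              # N inames
        outer = []
        cur = parent[iname]
        while cur is not None:        # at most k steps (max. loop depth)
            outer.append(cur)
            cur = parent[cur]
        nest_around[iname] = frozenset(outer)
    return nest_around


# Hypothetical nesting: e { i { j } ; q }
parent = {"e": None, "i": "e", "j": "i", "q": "e"}
result = loop_nest_around_map(parent)
```

Each iname walks up at most k ancestors, so the total work is bounded by N·k steps under this assumed representation.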