Kaushik Kulkarni

Results 80 comments of Kaushik Kulkarni

@inducer: Not really. #372 includes commits from this branch, so that I could do some tests on my test problems. But I propose we merge them separately. (I've updated the...

> Could the `np.isscalar(other)` check be replaced (or augmented with) `other.size == 1`, perhaps?

Replacing or augmenting with `other.size == 1` might not be the correct behavior as the resulting...
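
To make the distinction in this exchange concrete, here is a minimal sketch in plain NumPy (not the array type under discussion, which the excerpt does not name). It only illustrates one possible reason the two checks are not interchangeable, namely that a one-element array still takes part in broadcasting; whether that is the concern the truncated reply goes on to raise is an assumption.

```python
import numpy as np

x = np.zeros(3)

scalar = np.float64(2.0)     # a true scalar
one_elem = np.ones((1, 1))   # size == 1, but a 2-D array, not a scalar

# np.isscalar accepts only the true scalar.
is_scalar_flags = (np.isscalar(scalar), np.isscalar(one_elem))

# A size check cannot tell the two apart: both hold one element.
sizes = (np.size(scalar), one_elem.size)

# The results differ in shape, because the (1, 1) operand broadcasts.
shape_with_scalar = (x + scalar).shape
shape_with_one_elem = (x + one_elem).shape

print(is_scalar_flags, sizes, shape_with_scalar, shape_with_one_elem)
```

In this sketch, treating `one_elem` as if it were a scalar would hide the extra leading axis that broadcasting adds to the result.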

This is the log sorted by the cumulative time spent. There doesn't seem to be any obvious low-hanging fruit in this case: ``` 146881842 function calls (140188954 primitive calls)...
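
For readers unfamiliar with this kind of log, a report sorted by cumulative time can be produced with the standard-library `cProfile` and `pstats` modules. The sketch below is an assumption about the tooling (the excerpt does not say how the log was generated), and `work` is a hypothetical stand-in for the real workload.

```python
import cProfile
import io
import pstats


def work(n):
    # Hypothetical stand-in workload; the real one is not shown in the excerpt.
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = work(10_000)
profiler.disable()

# Collect the report in a string and sort it by cumulative time.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(10)
report = buf.getvalue()
print(report)
```

The first line of such a report gives the total number of function calls, and the `cumtime` column is the time spent in a function including everything it calls.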

I have one: [big_kernel.py[6.5MB]](https://gitlab.tiker.net/inducer/loopy/uploads/ba67accfcaa8225fb65b9dab7d7350f5/big_kernel_loopy.py), but on the current master it takes around 50 minutes, with a peak memory requirement of 15GB, on [koelsch](https://github.com/illinois-scicomp/machine-shop-maintenance/wiki/User-notes#koelschdtikernet-to-become-koelschcsillinoisedu).

Hi Lawrence, Thanks for sharing the kernel. The [flamegraph (download for interactive experience)](https://gist.githubusercontent.com/kaushikcfd/301752e5890a8a1a04cc202579edc06e/raw/3f2a4ca4824b901b6a18cfbed57afb60d241a73b/long_firedrake_knl.svg) on `main` for the entire lowering pipeline tells us these are the biggest offenders: - 27%...

FWIW, I don't dislike the justification in #127. I'm concerned that it isn't upheld that strictly, i.e. there's some sort of difference between hardware axes inames and "other" kernel...

> I'm also tempted to prohibit (with a deprecation period, sure) separate "loop entry roots" (i and l, in your kernel above) sharing a domain. For a kernel with...

IIUC the following loop nest:

```
for iel
  for idof1
  end idof1
  for idof2
  end idof2
  .
  .
  .
  for idof1000
  end idof1000
end iel
```

Wouldn't this always lead...

I'm liking it: it's precisely defined and I am unable to break it.

> Would iel and idof1 be allowed to share a domain?

I would allow it as I...