Cedric Nugteren
Hmm. Let's see. About the compilation error you get, I think you should remove the function `stod` here: https://github.com/CNugteren/CLBlast/blob/master/src/utilities/android.hpp#L33 (3 lines of code). It was added there because compilers on...
I've added a test branch (`adreno_tryout`) in CLBlast to test the Qualcomm-provided kernel from the tutorial mentioned above. This is a very hacky integration of that kernel and is in...
Thanks both for trying it out, very useful! @sivagnanamn: I'm working on improving compilation. This `database.cpp` was always a tricky one; I improved things over time, but I will now try...
Thanks for letting us know! I'm planning to work on this again a bit in the coming weeks. I'll try to cover all the above kernels within CLBlast such that...
Indeed, in the `adreno_tryout` branch the kernel assumes multiples of 16 or 32 or so, and you have e.g. k=27. That won't work. I hope to work on this again...
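To illustrate the size constraint, here is a minimal sketch, assuming the kernel needs `k` to be a multiple of 16 (the exact factor is not confirmed in this thread). `IsMultiple` and `RoundUp` are hypothetical helpers written for this example, not part of the CLBlast API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if `value` is a whole multiple of `multiple`.
bool IsMultiple(std::size_t value, std::size_t multiple) {
  assert(multiple > 0);
  return value % multiple == 0;
}

// Hypothetical helper: smallest multiple of `multiple` that is >= `value`.
// A caller could pad its matrices with zeros up to this size before
// handing them to a kernel that only accepts multiples.
std::size_t RoundUp(std::size_t value, std::size_t multiple) {
  assert(multiple > 0);
  return ((value + multiple - 1) / multiple) * multiple;
}
```

With an assumed factor of 16, `IsMultiple(27, 16)` is false, which is why k=27 fails, and `RoundUp(27, 16)` gives 32 as the padded size. Zero-padding up to the rounded size is one common workaround in general; whether it suits this particular kernel is an assumption.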
For the record, I've been working on this issue for the last week. Basically I took the kernel from the `adreno_tryout` branch and am generalising it with the regular CLBlast tuning...
As I said above, I took the kernel from the `adreno_tryout` branch and have fully integrated it into CLBlast. That means it is also tuneable with many of the same...
OK, but that's perhaps an issue on your device/platform? I'll continue development a bit (some related tests still fail: SYRK/HERK) and then I'll test on some other platforms as well....
I have completely finished the implementation and also tested it on another machine; no issues seen so far. I'll soon merge the branch with master. Did you see good performance while...
FYI, this branch is now merged into master.