kodonnell

55 comments by kodonnell

FYI it was suggested that https://github.com/phoboslab/qoi/issues/145 might be more appropriate here. Wouldn't mind someone checking the results, as they were pretty compelling, especially as it's "free".

I'm afraid I don't have any such images - can you send some through? I was going to use the Python image similarity package which supports other metrics, but didn't...

My Intel results from this change (if relevant) are [here](https://github.com/CNugteren/CLBlast/issues/257#issuecomment-383410884).

Some more results. Note that Beignet (which I used) is 10-20% slower than Intel NEO. [Intel(R) HD Graphics 6000 BroadWell U-Processor GT3.zip](https://github.com/CNugteren/CLBlast/files/1959410/Intel.R.HD.Graphics.6000.BroadWell.U-Processor.GT3.zip)

> Typically running the kernel itself is the least amount of the total tuning time.

For quick kernels, yes. Some of the kernels were reporting 30 second runtimes, if I...

> A quick search ...

But that does hint at a solution (though, as above, it'd involve rewriting the kernels).

> So unless someone has a solution, I'm afraid...

Also - I did originally intend this issue to include other ideas, not just my own, so I'll leave this open. (You're welcome to close it though.)

Regarding tuning, have you considered doing this in Python? Performance shouldn't (?) be a problem, as the underlying kernels are the same. It's arguably going to reach a wider audience,...

Following up that comment - you could also consider not doing the tuning directly, but allowing users to implement their own searches (though you'd probably have some default ones implemented)....
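To make the "users implement their own searches" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `tune` function, the two strategies, the parameter names `WGS` and `VW`, and the `fake_benchmark` cost function (which stands in for actually compiling and timing a kernel) are hypothetical and are not part of CLBlast or its tuner.

```python
import itertools
import random
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Config = Dict[str, int]
Space = Dict[str, List[int]]


def full_search(space: Space) -> Iterator[Config]:
    """Default strategy: yield every combination of parameter values."""
    names = sorted(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_search(space: Space, samples: int = 4, seed: int = 0) -> Iterator[Config]:
    """Alternative strategy: yield a fixed number of randomly drawn configurations."""
    rng = random.Random(seed)
    for _ in range(samples):
        yield {name: rng.choice(values) for name, values in space.items()}


def tune(
    space: Space,
    benchmark: Callable[[Config], float],
    strategy: Callable[[Space], Iterator[Config]] = full_search,
) -> Optional[Tuple[Config, float]]:
    """Run `benchmark` on each configuration the strategy proposes; keep the fastest."""
    best: Optional[Tuple[Config, float]] = None
    for config in strategy(space):
        elapsed = benchmark(config)
        if best is None or elapsed < best[1]:
            best = (config, elapsed)
    return best


# Hypothetical parameter space and a stand-in cost function (no real kernel is run).
space = {"WGS": [32, 64, 128], "VW": [1, 2, 4]}


def fake_benchmark(config: Config) -> float:
    return abs(config["WGS"] - 64) + abs(config["VW"] - 4)


best_full = tune(space, fake_benchmark)
best_random = tune(space, fake_benchmark, strategy=random_search)
```

A user-supplied search is then just another generator with the same signature as `full_search`, passed in through the `strategy` argument, so the tuner itself never needs to know how configurations are chosen.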

Cool - no problem.

> Not sure how that will work.

I was more referring to a suspicion that most users interested in OpenCL will be in machine learning etc.,...