bhack
Intel extensions in beignet are in https://cgit.freedesktop.org/beignet/tree/include/CL/cl_intel.h
I think the problem could be with libdrm and the kernel version. Which versions of both are you using?
Hmm. Can you print `fixed_local_sz[i]` inside the loop, before the modulo, at https://cgit.freedesktop.org/beignet/tree/src/cl_api.c#n3031
Have you tried to debug/print that loop?
I don't know if this [Beignet Workgroup guide](https://www.freedesktop.org/wiki/Software/Beignet/optimization-guide/) is still valid.
It is important to check whether `realGroupSize *= fixed_local_sz[i];` accumulates correctly. If you compiled with debug symbols, you can also check this with gdb breakpoints.
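To make the debugging suggestion concrete, here is a minimal standalone sketch of the kind of check I mean. It is not Beignet's actual code: `check_local_sizes` and its parameters are hypothetical names, and I am only assuming the loop does a per-dimension modulo test followed by the `realGroupSize` accumulation. It prints each `fixed_local_sz[i]` before the modulo and the running product after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the loop under discussion; NOT Beignet's code.
 * Returns 0 when every local size is non-zero and divides the matching
 * global size evenly, -1 otherwise. On success, *real_group_size holds
 * the product of all local sizes. */
static int check_local_sizes(unsigned work_dim,
                             const size_t *global_sz,
                             const size_t *fixed_local_sz,
                             size_t *real_group_size)
{
    size_t product = 1;

    /* OpenCL allows 1 to 3 work dimensions. */
    assert(work_dim >= 1 && work_dim <= 3);

    for (unsigned i = 0; i < work_dim; i++) {
        /* Print the value before the modulo, as suggested above. */
        fprintf(stderr, "dim %u: global=%zu local=%zu\n",
                i, global_sz[i], fixed_local_sz[i]);

        if (fixed_local_sz[i] == 0 ||
            global_sz[i] % fixed_local_sz[i] != 0)
            return -1;

        /* Same accumulation pattern as realGroupSize *= fixed_local_sz[i]. */
        product *= fixed_local_sz[i];
        fprintf(stderr, "dim %u: realGroupSize so far=%zu\n", i, product);
    }

    *real_group_size = product;
    return 0;
}
```

If the printed local size for some dimension is 0 or does not divide the global size, that would point at where the value goes wrong before the accumulation is even reached.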
@gongzg How can execution reach https://cgit.freedesktop.org/beignet/tree/src/cl_api.c#n3036 if local_work_size is not NULL?
OK, so this message was probably generated by the autotuning code. Where is "Verification was not successful, fallback to basic kernel" in the code?
Can you expand a little bit more on this?
We had a proposal for a TF2 port at https://github.com/tensorflow/addons/issues/2388. It is under TF ecosystem review.