Gordon Brown

Results: 61 comments by Gordon Brown

I agree; my thinking was that we should reintroduce the wording that we had for this in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0796r3.pdf, as we spent quite a bit of time refining that in the...

I can see the benefit of having this feature. I think the best approach to supporting this would be to have a default constructor for the accessor class when `access::placeholder`...

Thinking further about how a potential interface for this could look, I see two possible ways to do this. First, we could have a callback mechanism, where a user provides...

Yes, you're right, I think we will. Since we are separating the execution and memory topologies, there will now be no object which you can construct from a memory resource...

That's right, the approach described here is the only correct way, following the current SYCL 2020 specification, to guarantee the operations enqueued within the host task are synchronized with; however,...

@densamoilov apologies for the late reply, I hadn't seen your response. That's right, this solution would only work in the case of a single in-order queue, though as it relies...

Hi @Soujanyajanga, thanks for raising this issue; I can give you an update on the progress of these operations for the Nvidia backend. For `getrf_batch`, Nvidia supports an equivalent to...

@Soujanyajanga Yes, I think this is the approach we would take. If you can share your workaround, that could be useful, thanks. I've added this to our roadmap so someone...

Now that https://github.com/intel/llvm/pull/5095 is merged, this should address the problem for the CUDA backend, so I will remove the CUDA label. @bader I believe the remaining issue here is with...

I'm not sure about Level Zero, but AFAICT OpenCL doesn't have any limitation on the global work size; the only thing I see is the `CL_KERNEL_GLOBAL_WORK_SIZE` query for `clGetKernelWorkGroupInfo`,...