unified-runtime icon indicating copy to clipboard operation
unified-runtime copied to clipboard

[CUDA][HIP] Associate queue with device in context

Open hdelan opened this issue 1 year ago • 2 comments

Making a native queue doesn't require hDevice to be non null, but this associates the queue with a null device, even if hContext contains valid devices.

This fixes the behaviour for a single device context, making the queue be associated with the only device in the context. But there is a question mark over whether this entry point makes sense for a multi device UR context. Should we be able to construct a queue with a native queue without knowing what the UR device associated with the queue is? Probably not.

hdelan avatar Aug 19 '24 16:08 hdelan

This fixes the behaviour for a single device context, making the queue be associated with the only device in the context. But there is a question mark over whether this entry point makes sense for a multi device UR context. Should we be able to construct a queue with a native queue without knowing what the UR device associated with the queue is? Probably not.

This is a weird case and maybe we need to consider what the expectation from the UR API is. In the L0 adapter it just picks the first device in the platform, not even caring if it is actually in the context.

steffenlarsen avatar Aug 20 '24 06:08 steffenlarsen

This is a weird case and maybe we need to consider what the expectation from the UR API is.

I think that's a good idea. I am inclined to think that we shouldn't allow the hDevice param to be null so that we always have a valid device associated with the queue. We can optionally check if hContext contains hDevice but I think the context param is sort of meaningless.

In the L0 adapter it just picks the first device in the platform, not even caring if it is actually in the context.

That seems bad.

hdelan avatar Aug 20 '24 09:08 hdelan

Could we separate the discussion about multi-device context from this single-device context fix? This PR brings back the (relatively uncontroversial) behaviour broken recently and we have a customer waiting for this fix.

rafbiels avatar Sep 02 '24 10:09 rafbiels

Good idea. Sorry I have been sleeping on this one. I'll make an issue about this question and then hopefully we can merge this.

hdelan avatar Sep 02 '24 10:09 hdelan

https://github.com/oneapi-src/unified-runtime/issues/2041

hdelan avatar Sep 02 '24 10:09 hdelan

Ping @steffenlarsen are you happy to continue discussion about this entry point in the linked issue? If so are you happy with this as a quick-fix change?

hdelan avatar Sep 03 '24 17:09 hdelan