
[DLPack] Update stream=None default guideline

Open tqchen opened this issue 3 months ago • 3 comments

Previously we landed stream=None mapping to the legacy default stream (the safer choice). As DLPack has become popular, one of the most canonical use cases is exchanging tensors between a library and PyTorch. Most libraries have not been updated to accept a passed-in stream, and many expect the behavior to be no-sync, which works better for cases like CUDA graph capture:

import torch

s = torch.cuda.Stream()
x = torch.randn(8, device="cuda")
g = torch.cuda.CUDAGraph()

with torch.cuda.stream(s):
    with torch.cuda.graph(g):
        _ = x + 1
        # mylib imports x via DLPack while the graph is being captured;
        # mylib and mylib_kernel stand in for any third-party consumer.
        mylib_tensor = mylib.from_dlpack(x)
        mylib_kernel(mylib_tensor)

In the above code example, if stream=None maps to no sync (currently stream=-1), then the CUDA graph capture works out of the box. Otherwise, the capture no longer works because of the synchronization. This is only a question of the default behavior, since mylib can always pass in a specific stream explicitly.
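For concreteness, here is a minimal sketch (hypothetical, not mylib's actual code) of the two consumer-side choices, assuming the producer implements the standard stream values; _wrap_capsule is a stand-in for however mylib turns a DLPack capsule into its own tensor type:

def _wrap_capsule(capsule):
    # Placeholder: mylib would construct its own tensor type from the capsule here.
    return capsule

def from_dlpack_no_sync(x):
    # stream=-1: explicitly request no synchronization from the producer.
    # Safe during CUDA graph capture, since the producer performs no event
    # wait or stream synchronization that would invalidate the capture.
    return _wrap_capsule(x.__dlpack__(stream=-1))

def from_dlpack_legacy_default(x):
    # stream=None: under the current guideline this maps to the legacy default
    # stream, so the producer may synchronize against it -- which is exactly
    # what breaks the torch.cuda.graph() capture in the example above.
    return _wrap_capsule(x.__dlpack__(stream=None))

The default in question decides which of these two a bare mylib.from_dlpack(x) call behaves like when no stream is given.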

So the discussion only focuses on the guideline for the default behavior. The original rationale for the default was that the legacy stream was a "safe choice". However, as DLPack-based exchange becomes widespread and CUDAGraph integration becomes critical, it could make sense for the default to optimize for the common use case (stream=None defaulting to no-sync where applicable).

It is worth pointing out that no-sync was also the implicit original behavior before the stream proposal, and before frameworks got updated (many only recently, as in the case of torch), so many libraries may indeed have implicitly relied on that behavior.

Regardless of the choice here, I think we should definitely update the guideline to encourage users to explicitly pass in stream, and document the rationale for the no-sync behavior, its relation to CUDA graphs, etc., to help libraries pick.
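As a sketch of what "explicitly pass in stream" looks like on the consumer side (hypothetical code, using torch's stream API as a stand-in for however the consumer library exposes its current CUDA stream handle):

import torch

def from_dlpack_explicit(x):
    # Hand the producer the consumer's current CUDA stream as a plain integer
    # (the cudaStream_t value), so any producer-side synchronization targets
    # exactly the stream the consumer will run its kernels on.
    handle = torch.cuda.current_stream().cuda_stream
    # The protocol disallows stream=0; the legacy default stream is spelled 1.
    capsule = x.__dlpack__(stream=handle if handle != 0 else 1)
    return capsule  # the consumer would wrap this capsule into its own tensor type

With an explicit stream there is no ambiguity left for the default to resolve.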

tqchen avatar Sep 22 '25 12:09 tqchen

See https://github.com/pytorch/pytorch/pull/163242 for a more detailed discussion (in the context of PyTorch and the related motivation).

tqchen avatar Sep 22 '25 12:09 tqchen

As a follow-up, PyTorch is changing its default behavior to non-capture, so it is something worth following through on.

tqchen avatar Sep 24 '25 18:09 tqchen

After discussing a bit and trying to understand it better (correct me if I'm wrong)... I don't think we should discuss this as a proposal to actually change the standard (even if we end up changing the wording).

There just isn't a problem protocol-wise, and the current spelling is clearly better than the proposal!

The actual problem is this:

  • Torch was always buggy here and translated stream=None to no synchronization.
  • Someone fixed the bug.
  • Some other library (one/many/which one? I don't know yet) never passed stream= but relied on no synchronization happening.
  • This caused the need for a regression fix. (I have no idea whether it was properly investigated how easy it would have been to fix mylib quickly.)
  • Rather than spelling it out as an unfortunate hot-fix and informing others about it here, that was turned into a proposal to use stream=-1 as the default forever, or to change the meaning of stream=None.

My grumpy opinion is that the last step is very off the mark. This should always have been a temporary hot-fix in torch, with issues opened in the libraries that don't pass a stream to point at.

The point of this issue then should just be to inform that there is this unfortunate behavior that torch is stuck with for the time being... And whether we should document that somewhere so others know.


What does that mean? I dunno... Given the current state of information, I reject the proposal to change anything.

We could in fact say that stream= must be passed in, but I don't really like it; it should certainly be passed, and we can write that. I am fine with noting something like: prior to version 1.2 some libraries (torch *cough*) were buggy, so please really, really do pass it in. (And if we don't go with "must pass it in", then maybe we should have a torch issue to refer to.)

seberg avatar Oct 17 '25 12:10 seberg