tokio Idea: Differentiate between Task-local and "spawn-propagated" Task-local

Task-local data is not propagated across calls to spawn, which aligns with the behavior of Thread-local data. But since Tasks are "lighter" than threads, a common pattern is to effectively create a Task tree in order to provide parallelism while serving a request. This kind of tree might benefit from a variant of Task-local that was automatically propagated across spawns by the runtime.

An example that we encountered (related to https://github.com/tokio-rs/tracing: we'll probably try to port to it at some point) seems to be a natural fit for spawn-propagated Task-locals: all of the tracing state information in your Task tree should be propagated across spawn boundaries in order to capture the entire tree of Tasks in one "trace".

To achieve this in our use case, we attempt to inject propagation of the tracing information at every spawn boundary... but this is error-prone, because a user can forget and call tokio::spawn directly without propagation.
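To make the failure mode concrete, here is a minimal std-only sketch. It uses OS threads and a thread-local as a stand-in for tokio tasks and task-locals, and the names TRACE_ID, set_trace_id, trace_id and spawn_propagating are hypothetical, not part of tokio or of our code:

```rust
use std::cell::RefCell;
use std::thread::{self, JoinHandle};

thread_local! {
    // Hypothetical ambient tracing state. A thread-local stands in for a task-local.
    static TRACE_ID: RefCell<Option<u64>> = RefCell::new(None);
}

/// Sets the ambient trace id for the current thread.
fn set_trace_id(id: Option<u64>) {
    TRACE_ID.with(|t| *t.borrow_mut() = id);
}

/// Reads the ambient trace id of the current thread.
fn trace_id() -> Option<u64> {
    TRACE_ID.with(|t| *t.borrow())
}

/// Hypothetical wrapper that callers are expected to use instead of
/// `thread::spawn`: it captures the caller's trace id and re-installs
/// it in the child before running `f`.
fn spawn_propagating<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let inherited = trace_id(); // captured on the parent, before spawning
    thread::spawn(move || {
        set_trace_id(inherited); // installed on the child
        f()
    })
}
```

A child started through spawn_propagating sees the parent's id, while a child started with a plain thread::spawn starts from the default (None). Nothing stops a caller from using the plain spawn, which is exactly the mistake described above.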

Apr 11 '20 23:04 stuhood

@hawkw : This is related to #1820 (which is maybe stale given LocalKey?) where you mention that the tracing spans are "ambient". Does tracing have a hook into the tokio runtime similar to what is described here?

Apr 12 '20 00:04 stuhood

@stuhood tracing does not integrate with the Tokio runtime directly, nor does it use task-local storage. Instead, the tracing-futures crate provides a future combinator that wraps individual futures so that they enter a context in their poll method, and exit that context when the poll returns.
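As a rough illustration of that combinator shape (not the actual tracing-futures implementation), here is a std-only sketch. CURRENT_SPAN, current_span, InSpan and poll_once are hypothetical names, and the "span" is just a string held in a thread-local:

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    // Hypothetical ambient context: the span currently entered on this thread.
    static CURRENT_SPAN: RefCell<Option<String>> = RefCell::new(None);
}

/// Returns the span entered on the current thread, if any.
fn current_span() -> Option<String> {
    CURRENT_SPAN.with(|s| s.borrow().clone())
}

/// Hypothetical wrapper future: enters `span` for the duration of each
/// `poll` call and restores the previous span when the poll returns.
struct InSpan<F> {
    inner: Pin<Box<F>>,
    span: String,
}

impl<F: Future> Future for InSpan<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Enter: remember whatever was current and install our span.
        let previous = CURRENT_SPAN.with(|s| s.replace(Some(self.span.clone())));
        let result = self.inner.as_mut().poll(cx);
        // Exit: put the previous span back (not panic-safe in this sketch).
        CURRENT_SPAN.with(|s| *s.borrow_mut() = previous);
        result
    }
}

/// Waker that does nothing, only used to poll a future once without a runtime.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` a single time on the current thread.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Box::pin(fut).as_mut().poll(&mut cx)
}
```

Because the context is entered per poll rather than per task, the wrapped future observes the span while it runs, and the thread goes back to whatever was current as soon as the poll returns.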

I think the idea of inheriting task-local values is definitely an interesting one! I'm not sure how well it would work for the tracing crate specifically, because we've generally found that users tend to want spans with the granularity of individual futures rather than whole tasks, so we would probably still be setting the current span on poll. However, if we stored it in a task local that was inherited by spawned tasks, we could then use that mechanism for automatically propagating span context across spawns.

This isn't directly related to this issue, but a brief discussion of the general problem being solved:

tracing doesn't currently have automatic propagation of spans across spawn points. This is partially because I think it depends somewhat on the use-case: sometimes you don't want a spawned task to be in the same trace span as its parent. Instead, we have an in_current_span combinator for manually propagating contexts, so you can write

tokio::spawn(my_future.in_current_span())

I'd definitely like to provide nicer ergonomics for propagating across spawns in cases where that's the behavior you want. But, I think it's important to ensure that there is, at least, a way to opt out.

Apr 13 '20 18:04 hawkw

I also need this functionality for #1845. I was thinking it might be possible to do via an internal task-local consisting of type AnyMap = HashMap<SomeIdentifier, Box<dyn Any>>.

We would have something like inheritable_task_local! which would access the map values. On spawn, we would just copy the entire map to the spawned task.

This assumes the following:

Modifying or removing a key in the map after spawn will not be reflected in the spawned task.
Values in the map need to be Clone + Send.
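A minimal sketch of the copy-on-spawn map, under some loud assumptions: OS threads and a thread-local stand in for tokio tasks and the internal task-local, TypeId plays the role of SomeIdentifier, and Inheritable, set_inherited, get_inherited and spawn_inheriting are hypothetical names. A small helper trait is used because Box<dyn Any> alone cannot be cloned:

```rust
use std::any::{Any, TypeId};
use std::cell::RefCell;
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

/// Type-erased value that can be cloned into a spawned child.
trait Inheritable: Send {
    fn clone_box(&self) -> Box<dyn Inheritable>;
    fn as_any(&self) -> &dyn Any;
}

// Every `Clone + Send + 'static` value qualifies (the two assumptions above).
impl<T: Any + Clone + Send> Inheritable for T {
    fn clone_box(&self) -> Box<dyn Inheritable> {
        Box::new(self.clone())
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

type InheritMap = HashMap<TypeId, Box<dyn Inheritable>>;

thread_local! {
    // Stand-in for the internal task-local holding the inheritable values.
    static INHERITED: RefCell<InheritMap> = RefCell::new(HashMap::new());
}

/// Stores `value` in the current context, keyed by its type.
fn set_inherited<T: Any + Clone + Send>(value: T) {
    INHERITED.with(|m| {
        m.borrow_mut().insert(TypeId::of::<T>(), Box::new(value));
    });
}

/// Returns a clone of the value of type `T` in the current context, if any.
fn get_inherited<T: Any + Clone>() -> Option<T> {
    INHERITED.with(|m| {
        m.borrow()
            .get(&TypeId::of::<T>())
            .and_then(|v| (**v).as_any().downcast_ref::<T>())
            .cloned()
    })
}

/// Spawns `f` with a copy of the caller's whole map, taken at spawn time.
fn spawn_inheriting<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let copy: InheritMap = INHERITED.with(|m| {
        m.borrow()
            .iter()
            .map(|(k, v)| (*k, (**v).clone_box()))
            .collect()
    });
    thread::spawn(move || {
        INHERITED.with(|m| *m.borrow_mut() = copy);
        f()
    })
}
```

Since the child gets its own cloned map, writes on either side after the spawn stay local to that side, which matches the first assumption in the list.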

I will try to hack something up.

Apr 19 '20 01:04 gardnervickers

@gardnervickers Did you make any progress on this? I'm interested in helping with this.

Apr 29 '24 17:04 Xaeroxe

I made a version of this which can work outside of tokio.

https://github.com/Xaeroxe/tokio-inherit-task-local

If this implementation were moved inside of tokio then there would be no need to annotate the future with .inherit_task_local() as I have in my examples. I'm open to integrating this (or something like it) into tokio proper. For now, I suppose if you need this functionality it's available in that repo. I'm also open to publishing on crates.io if there is demand for that.

May 07 '24 22:05 Xaeroxe

One thing I'd like to point out is that my version has a couple of slightly odd restrictions due to how I'm using ctor. This was done to create high performance unique indexes into the table at startup without the need for the programmer to consider what those keys might be. This is just one technique for generating unique keys, and I'm open to the idea of another technique being used. I wouldn't even insist that the keys have to be integers. They could be used to access a HashMap instead.

May 07 '24 22:05 Xaeroxe

I opted to publish this on crates.io. Prior to doing this I removed my dependency on ctor and opted to use a HashMap instead. This makes the crate more broadly compatible with many targets.

May 08 '24 23:05 Xaeroxe
