
Non-nested regions - options / best practice

Open PeterTh opened this issue 1 year ago • 6 comments

First of all, Tracy is extremely useful, thanks for that!

Now, the issue I want to discuss is how to best deal with imperfectly nested and/or overlapping regions. We ran into this issue in our integration, and found two ways in principle to work with these in the current version of Tracy:

  • Implement the regions as discontinuous frames. This is rather straightforward, but not applicable in our case, as we cannot uniquely enumerate/identify the regions at compile time. Even if we could do that, zones seem like a much better semantic fit for our regions, and also allow attaching more information.
  • Implement the regions by (ab)using the fiber mechanism. This is what we currently do. Basically, we pre-allocate a set of "lanes" for our potentially overlapping regions, select a free one when a region starts, and associate a fiber with that lane. We then switch to that same fiber (and back) both when starting and when stopping the associated zone (using the C API and manually managing the context lifetime).

The second option allows us to create zones that match our semantics, but it seems like quite a circuitous route to get to this goal, and not really in the spirit of the fiber feature. We also need to manually manage both the set of "fibers" used in the workaround as well as the zone contexts. Of course, we can build an abstraction around this, and will probably do so if there is no other way.
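
To make the "lanes" bookkeeping described above concrete, here is a minimal sketch of a lane pool. It is not taken from the Celerity or Tracy code bases: `LanePool`, `acquire`, `release`, and the "async lane N" naming are hypothetical names chosen for illustration. The sketch only covers picking and freeing a lane; the profiler calls themselves are left out.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper (not part of Tracy): hands out reusable "lanes".
// Each lane owns a name string that could serve as a fiber name.
class LanePool {
public:
    // Returns the index of the first free lane, creating one if all are busy.
    std::size_t acquire() {
        for (std::size_t i = 0; i < lanes_.size(); ++i) {
            if (!lanes_[i].busy) {
                lanes_[i].busy = true;
                return i;
            }
        }
        lanes_.push_back({"async lane " + std::to_string(lanes_.size()), true});
        return lanes_.size() - 1;
    }

    // Marks a lane as free again once its region has ended.
    void release(std::size_t lane) { lanes_.at(lane).busy = false; }

    // Name of a lane. std::deque does not move existing elements when it
    // grows at the end, so the returned reference stays valid, which matters
    // if the name pointer is handed to a profiler that keeps it.
    const std::string& name(std::size_t lane) const { return lanes_.at(lane).name; }

    std::size_t size() const { return lanes_.size(); }

private:
    struct Lane {
        std::string name;
        bool busy;
    };
    std::deque<Lane> lanes_;
};
```

With this sketch, starting a region would call `acquire()` and use `name(lane).c_str()` as the fiber name, and ending the region would call `release(lane)`. The linear search is fine for a handful of lanes; a free list would be the obvious replacement if the number of concurrent regions grows.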

My questions are the following:

  • Is there a better way to implement this pattern which we have missed so far?
  • If not, are there any plans to natively support overlapping zones at some point? It's clear that these would have somewhat more overhead compared to the currently implemented zone semantics, so it would likely be a different interface.

PeterTh avatar Jul 14 '22 18:07 PeterTh

Zones (regions) are a representation of C scopes in threads. As such, it is not possible for a zone parent to end before its children. If you want something that behaves differently, you are no longer asking for zones. It's difficult to give a better answer without knowing your exact use case.

wolfpld avatar Jul 14 '22 21:07 wolfpld

Yeah, I realize that the zone concept isn't a great fit, that's why I called it "regions" in the title.

The use case is that we have a system which frequently has several (by necessity) asynchronous operations happening at the same time. These have a logical extent in time (from a defined start to a defined end), can be overlapping, but occur in one CPU thread. One example is asynchronous data transfer operations in a cluster.

We want to represent these in Tracy -- together with / in addition to all the normal Zones -- to gain more insight into what's going on.

(In case you are curious, the project is Celerity)

PeterTh avatar Jul 15 '22 08:07 PeterTh

What you describe sounds like a good fit for fibers. The "fibers" name might be a bit confusing, but it is just a name. The feature was built to handle any async operation, and fibers are simply what most people would associate with it. I do not think any new development to handle "regions" would be significantly different from what fibers already offer.

wolfpld avatar Jul 15 '22 10:07 wolfpld

We currently do use fibers for it, with the implementation I outlined above. It does work, but I see two (surmountable but noteworthy) issues with it for this use case, one usability-wise, and a more minor one regarding how it is presented visually.

Usability-wise, our current implementation means that starting an asynchronous region is, in pseudo-code:

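auto f = find_free_fiber();   // pick an unused "lane" (a fiber name)
store_fiber(f);
TracyFiberEnter(f);
TracyCZone(ctx, 1);           // C API zone; declares ctx, 1 = active
TracyCZoneName(ctx, descriptor, length(descriptor));
store_context(ctx);           // keep the context until the region ends
TracyFiberLeave;              // Tracy macro, used without parentheses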

And exiting a region is:

TracyFiberEnter(get_stored_fiber());
TracyCZoneEnd(get_stored_context());   // takes only the zone context
TracyFiberLeave;

Of course we can build an abstraction on top of that, but I wonder if having asynchronous regions isn't a sufficiently common pattern that it could be supported more "natively" with a single construct in the C++ API. I believe the primary use case for Tracy is games, but probably there could be uses for stuff like this there as well, in e.g. asynchronous asset loading or networking. Also -- and this is purely speculation, please correct me if I'm wrong -- I feel like a construct built for this specific use case might perform better than finding, opening and closing fibers just to put a single zone start/end pair into them.

Visually, using fibers in this way adds some "noise" displaying fiber activity, when we only really switch to and from fibers as an artifact in order to be able to start/stop the associated zone (screenshot: fibers_async_zones).

PeterTh avatar Jul 15 '22 14:07 PeterTh

If I wanted to put these regions next to each other, the way CPU context switches are currently displayed, the problem would be that with a generic solution you'd have to iterate an ever-growing list of regions to figure out what to draw. A more specialized solution would basically do what you are currently doing on the client side, i.e. search for a free "lane" or allocate a new one if there is no space available. In this case the existing optimizations could be used, but there would be some theoretical downsides, like the inability to filter and show two non-overlapping regions on the same "lane" if they were originally assigned to two different ones.
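
The "search for a free lane or allocate a new one" strategy can be sketched as a greedy first-fit pass over regions ordered by start time. This is only an illustration of the idea, not how Tracy stores or draws anything; `Region` and `assign_lanes` are hypothetical names, and regions are assumed to be half-open intervals `[start, end)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical region type: half-open time interval [start, end).
struct Region {
    std::int64_t start;
    std::int64_t end;
};

// Greedy first-fit: regions must be ordered by start time. Each region goes
// to the first lane whose previous region has already ended; if no such
// lane exists, a new lane is opened. Returns the lane index per region.
std::vector<std::size_t> assign_lanes(const std::vector<Region>& regions) {
    std::vector<std::int64_t> lane_end;  // end time of the last region per lane
    std::vector<std::size_t> lane_of;
    lane_of.reserve(regions.size());
    for (const Region& r : regions) {
        std::size_t lane = 0;
        while (lane < lane_end.size() && lane_end[lane] > r.start) {
            ++lane;  // lane still occupied when this region starts
        }
        if (lane == lane_end.size()) {
            lane_end.push_back(r.end);  // no free lane: allocate a new one
        } else {
            lane_end[lane] = r.end;     // reuse the free lane
        }
        lane_of.push_back(lane);
    }
    return lane_of;
}
```

For the regions `[0,10)`, `[2,5)`, `[6,8)`, `[11,12)` this yields lanes 0, 1, 1, 0. Because the lane is fixed when a region is recorded, hiding `[0,10)` later would still leave `[2,5)` and `[6,8)` on lane 1, which is the kind of filtering downside mentioned above.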

> I believe the primary use case for Tracy is games

You'd be surprised.

wolfpld avatar Jul 15 '22 14:07 wolfpld

I see. I'm not sure I understand the point about iterating the list of regions -- I feel like the client could keep those in a sorted 1D interval data structure and then iterating over a given interval should not be a big problem? You would of course have to iterate over the range once to figure out how much vertical space you need, and that range would continue to grow if you are looking at the entire timeline. I'm not sure how all of this works internally, but if the threads are drawn top-to-bottom then this might not need to be an extra iteration in addition to the actual drawing, which I assume is always O(N) at least.
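
As a rough illustration of the "iterate once to figure out how much vertical space you need" step, the following sketch computes the maximum number of simultaneously active intervals inside a visible time range using a sweep over start/end events. It assumes a plain vector sorted by start time rather than a real interval tree, and `Interval` and `max_overlap` are hypothetical names; nothing here reflects Tracy's internals.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical interval type: half-open time interval [start, end).
struct Interval {
    std::int64_t start;
    std::int64_t end;
};

// Maximum number of intervals active at the same time within
// [view_start, view_end). The input must be sorted by start time.
std::size_t max_overlap(const std::vector<Interval>& sorted_by_start,
                        std::int64_t view_start, std::int64_t view_end) {
    std::vector<std::pair<std::int64_t, int>> events;
    for (const Interval& iv : sorted_by_start) {
        if (iv.start >= view_end) break;     // sorted: nothing later is visible
        if (iv.end <= view_start) continue;  // ended before the visible range
        events.emplace_back(std::max(iv.start, view_start), +1);
        events.emplace_back(std::min(iv.end, view_end), -1);
    }
    // Pairs sort by time, then by delta, so at equal times an end (-1) is
    // processed before a start (+1), matching the half-open convention.
    std::sort(events.begin(), events.end());
    int active = 0;
    int best = 0;
    for (const auto& event : events) {
        active += event.second;
        best = std::max(best, active);
    }
    return static_cast<std::size_t>(best);
}
```

Note that this sketch still walks past every interval that ended before the visible range, so it is linear in the number of recorded intervals up to the end of the view; avoiding that scan is exactly where an interval tree or similar index would come in.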

Anyway, we can keep using our client-side workaround for now, but I wonder if you would be in principle interested in a new type of primitive (I imagine something like TracyAsyncStart and TracyAsyncEnd, with the usage/semantics that the user has to transport some data structure from start to end, and it then adds a potentially overlapping async region to the thread; those regions would be displayed below the stack of zones, and have the same features -- i.e. extra text, colors, stat visualization -- as zones).
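
To illustrate the start/end semantics proposed above, where the user carries some piece of data from start to end, here is a small stand-alone sketch. Everything in it is hypothetical: `AsyncRecorder`, `AsyncToken`, and `AsyncRegion` are invented for this example, timestamps are passed in by the caller, and none of it exists in Tracy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one async region; end is -1 while still open.
struct AsyncRegion {
    std::string text;
    std::int64_t start;
    std::int64_t end;
};

// Hypothetical token the caller keeps between start and end.
struct AsyncToken {
    std::size_t index;
};

// Hypothetical recorder: regions may overlap and may end in any order.
class AsyncRecorder {
public:
    AsyncToken start(std::string text, std::int64_t now) {
        regions_.push_back({std::move(text), now, -1});
        return AsyncToken{regions_.size() - 1};
    }

    void end(AsyncToken token, std::int64_t now) {
        regions_.at(token.index).end = now;
    }

    const std::vector<AsyncRegion>& regions() const { return regions_; }

private:
    std::vector<AsyncRegion> regions_;
};
```

A caller could start a "send" region at time 0 and a "recv" region at time 2, then end "send" at 5 and "recv" at 9. The two regions overlap without being nested, which is the case ordinary zones cannot express.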

PeterTh avatar Jul 20 '22 08:07 PeterTh