julia icon indicating copy to clipboard operation
julia copied to clipboard

add a way to add text to timing zones from julia

Open KristofferC opened this issue 7 months ago • 17 comments

Right now, there is no way to add text to a zone from the julia defined zones (text being e.g. the yellow text in)

image

When we moved the require code into Julia we for example lost the name of the package being loaded due to this.

To add this functionality, we need a way to get the "current active block" for the timing. Here I used a ScopeValue for this (inspired by https://github.com/topolarity/Tracy.jl/pull/36). However, this means that we only get access to this functionality quite late, maybe we want to be able to add text from Compiler as well. So maybe we want some more basic way to do this. I thought about using a sentinel value but we don't have any "hard scopes" that we can define a local variable in. Then I thought about the outer @zone, looking into the expression and "injecting" its own block into any timer_print calls it can find. That may work. Other ideas are welcome.

KristofferC avatar Apr 30 '25 15:04 KristofferC

Do you have a quick measurement for the overhead?

I'm guessing this makes our @zone allocating now, which is probably forgivable

topolarity avatar Apr 30 '25 19:04 topolarity

Do you have a quick measurement for the overhead?

Doing some updates now, will measure afterwards. I'm pretty sure the current Tracy.jl implementation does not allocate though. Will check that one too.

KristofferC avatar Apr 30 '25 19:04 KristofferC

Ok, I have three implementations now:

  1. A single global RefValue which we set the current block and then reset it, useful for bootstrapping, perhaps when we don't have much other functionality
  2. Storing a RefValue into TLS and updating / resetting it. Storing a RefValue seems better than a direct pointer because we can keep a reference to it and don't have to go via TLS again to reset it.
  3. Using ScopedValue.

Using the following benchmark;:

function f(print::Bool)
    Base.@zone "1" begin
        print && Base.timing_print("Hello")
        for i in 1:100
            Base.@zone "2" begin
                print && Base.timing_print("a")
            end
        end
        print && Base.timing_print("Dude!")
    end
end

f(true)

using BenchmarkTools

@btime f(false)
@btime f(true)

I get:

  1. Single RefValue
1.008 μs (0 allocations: 0 bytes)
3.146 μs (0 allocations: 0 bytes)
  1. ScopedValue
2.963 μs (606 allocations: 15.78 KiB)
4.778 μs (810 allocations: 18.97 KiB)
  1. TLS + RefValue
1.262 μs (0 allocations: 0 bytes)
1.688 μs (0 allocations: 0 bytes)

I don't understand how the TLS + RefValue is currently faster than the single global RefValue... I have to make some more checks but pushing up what I have.

KristofferC avatar Apr 30 '25 21:04 KristofferC

I think ScopedValue is what we semantically want, but we would need to have https://github.com/JuliaLang/julia/issues/53584#issuecomment-1979089217 fixed, potentially by https://github.com/JuliaLang/julia/pull/55045

vchuravy avatar Apr 30 '25 22:04 vchuravy

I think this would be good to have to restore some functionality we mused to have and allow for future customizations of the zones from julia. I'm happy for a review, but since this is not enabled by default, I'll merge in a couple of days.

KristofferC avatar May 08 '25 13:05 KristofferC

I like the task_local_storage() implementation overall, and I think it'd be preferable not to have too many variants

I'm assuming that bootstrapping doesn't support using that implementation everywhere?

topolarity avatar May 08 '25 13:05 topolarity

Thanks, I removed the ScopedValue code to make this a bit shorter and just had a reference to how that would be implemented to the Tracy PR.

KristofferC avatar May 08 '25 18:05 KristofferC

Just had an idea.. Can we use jl_current_task->ptls->timing_stack->tracy_ctx to get the timing block?

It's already being hooked into the task context by _jl_timing_block_start

topolarity avatar May 08 '25 18:05 topolarity

I can try it.

KristofferC avatar May 08 '25 18:05 KristofferC

Isn't that wrong under task migration?

gbaraldi avatar May 08 '25 19:05 gbaraldi

Isn't that wrong under task migration?

Can you explain what you think might go wrong?

topolarity avatar May 08 '25 19:05 topolarity

Push something to the stack in one thread and try to pop it from another thread.

gbaraldi avatar May 08 '25 19:05 gbaraldi

Push something to the stack in one thread and try to pop it from another thread.

Doesn't the ptls follow the migrated task?

topolarity avatar May 08 '25 19:05 topolarity

No, ptls is per thread

gbaraldi avatar May 08 '25 19:05 gbaraldi

Okay then that needs to be fixed, but it's orthogonal to this PR - Can you open an issue for it?

topolarity avatar May 08 '25 19:05 topolarity

Yeah

gbaraldi avatar May 08 '25 19:05 gbaraldi

@topolarity I tried to do this but I must be doing something wrong because I seem to only get null from the timing block:

https://github.com/JuliaLang/julia/commit/788b0e365977e5e65368c430829ac74938c1df63

KristofferC avatar May 09 '25 08:05 KristofferC