[FEATURE REQUEST] NVTX Ranges

Open lamarrr opened this issue 9 months ago • 4 comments

cuDF plans to adopt Jitify for more of our kernels and UDFs. To be able to effectively recommend them to our customers for their workloads, we need to understand Jitify's behavior and performance characteristics.

We need the following NVTX regions:

  • JIT Compilation time ranges
  • Memory or Disk Cache Load time ranges
  • JIT cache hit rates

Additionally, we'd need:

  • A way to disable caching. This is important for benchmarking, because benchmarks run for multiple iterations and a warm cache would hide the compilation cost after the first one.
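
For illustration, the requested ranges could look roughly like the sketch below. It uses the NVTX C API calls `nvtxRangePushA` and `nvtxRangePop` when the `nvtx3/nvToolsExt.h` header is available and falls back to no-op stubs otherwise. The `ScopedNvtxRange` helper, the range names, and the `load_from_cache`, `compile_kernel`, and `get_kernel` functions are hypothetical stand-ins and are not part of Jitify.

```cpp
#include <string>

// Use the real NVTX C API when its header is present; otherwise fall back to
// no-op stubs so this sketch still builds on machines without the CUDA toolkit.
#if defined(__has_include)
#if __has_include(<nvtx3/nvToolsExt.h>)
#include <nvtx3/nvToolsExt.h>
#define SKETCH_HAVE_NVTX 1
#endif
#endif
#ifndef SKETCH_HAVE_NVTX
inline int nvtxRangePushA(const char*) { return 0; }
inline int nvtxRangePop() { return 0; }
#endif

// Hypothetical RAII helper: pushes a named range on construction and pops it
// on destruction, so the range closes even on early return or exception.
class ScopedNvtxRange {
 public:
  explicit ScopedNvtxRange(const char* name) { nvtxRangePushA(name); }
  ~ScopedNvtxRange() { nvtxRangePop(); }
  ScopedNvtxRange(const ScopedNvtxRange&) = delete;
  ScopedNvtxRange& operator=(const ScopedNvtxRange&) = delete;
};

// Hypothetical stand-in for a memory or disk cache lookup.
// An empty string means "cache miss" in this sketch.
inline std::string load_from_cache(const std::string& /*key*/) {
  ScopedNvtxRange range("jit_cache_load");
  return std::string();
}

// Hypothetical stand-in for the JIT compilation step.
inline std::string compile_kernel(const std::string& source) {
  ScopedNvtxRange range("jit_compile");
  return "compiled:" + source;
}

// The outer range covers the whole lookup-then-compile path; the nested
// ranges show how the time splits between cache load and compilation.
inline std::string get_kernel(const std::string& source) {
  ScopedNvtxRange range("jit_get_kernel");
  std::string binary = load_from_cache(source);
  if (binary.empty()) {
    binary = compile_kernel(source);
  }
  return binary;
}
```

With nested ranges like these, a profiler timeline would show one outer range per kernel request, with the cache load and compile ranges inside it.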

lamarrr avatar Feb 13 '25 15:02 lamarrr

Thanks for the RFE, I like the idea. (Also great to hear you plan to use Jitify more extensively).

Is there a particular way you would suggest reporting cache hit rates via NVTX?

benbarsdell avatar Feb 14 '25 01:02 benbarsdell

Btw caching can be disabled by passing zero for max_in_mem and max_files when constructing ProgramCache, or by calling program_cache.resize(0).
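
To illustrate why a zero capacity disables caching, here is a minimal sketch using a toy LRU cache. This is not Jitify's `ProgramCache`: the `ToyKernelCache` class, its `get` method, and its hit/miss counters are hypothetical, and only the `max_in_mem` parameter name and the `resize(0)` idea are borrowed from the comment above.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Toy in-memory LRU cache (hypothetical, NOT Jitify's ProgramCache).
class ToyKernelCache {
 public:
  using CompileFn = std::function<std::string(const std::string&)>;

  explicit ToyKernelCache(std::size_t max_in_mem) : capacity_(max_in_mem) {}

  // Changing the capacity evicts entries that no longer fit;
  // resize(0) therefore empties the cache and keeps it empty.
  void resize(std::size_t max_in_mem) {
    capacity_ = max_in_mem;
    evict_to_capacity();
  }

  // Returns the cached value for `key`, or calls `compile` on a miss.
  std::string get(const std::string& key, const CompileFn& compile) {
    auto it = index_.find(key);
    if (it != index_.end()) {
      ++hits_;
      // Move the entry to the front to mark it most recently used.
      entries_.splice(entries_.begin(), entries_, it->second);
      return it->second->second;
    }
    ++misses_;
    std::string value = compile(key);
    if (capacity_ > 0) {  // With zero capacity nothing is ever stored.
      entries_.emplace_front(key, value);
      index_[key] = entries_.begin();
      evict_to_capacity();
    }
    return value;
  }

  std::size_t hits() const { return hits_; }
  std::size_t misses() const { return misses_; }
  std::size_t size() const { return entries_.size(); }

 private:
  using Entry = std::pair<std::string, std::string>;

  // Drop least recently used entries until the cache fits its capacity.
  void evict_to_capacity() {
    while (entries_.size() > capacity_) {
      index_.erase(entries_.back().first);
      entries_.pop_back();
    }
  }

  std::size_t capacity_;
  std::size_t hits_ = 0;
  std::size_t misses_ = 0;
  std::list<Entry> entries_;
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

In a benchmark loop, constructing such a cache with capacity 0 (or calling `resize(0)` before timing) means every iteration takes the miss path and pays the full compilation cost, which is the behavior the benchmarking use case asks for.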

benbarsdell avatar Feb 14 '25 10:02 benbarsdell

I've added NVTX integration in this commit: https://github.com/NVIDIA/jitify/commit/bf1c8c0531a9253d0a7c420fc5f35e90b79e4fad

It's in the https://github.com/NVIDIA/jitify/pull/131 branch, which I'm hoping to merge soon.

benbarsdell avatar Feb 15 '25 05:02 benbarsdell

> Is there a particular way you would suggest reporting cache hit rates via NVTX?

As long as the NVTX regions are added for the entire compilation and caching process, that would be enough in the meantime for performance investigation. As for the cache hit rates, we just need to be able to query the hit rates at runtime, which is already done in your commit.

Thanks for promptly looking into this!

lamarrr avatar Feb 18 '25 15:02 lamarrr