kineto
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
Summary: Although hipGetDeviceProperties shows 8 devices enumerating from 0 to 7, when using roctracer_record_t, the record->device_id enumerates from 2 to 9. Manually enumerate from 0-7 by subtracting 2, and opened...
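The remapping described in this summary can be sketched as a small helper. This is only an illustration in Python, not the code from the change itself: the function name `normalize_device_id` and the constant `DEVICE_ID_OFFSET` are hypothetical, and the offset of 2 is taken from the summary, so it may differ on other systems.

```python
# Hypothetical helper: map the device_id seen in a trace record back to the
# 0-based index reported by device enumeration.
# Assumption: the offset of 2 comes from the summary above and is not universal.
DEVICE_ID_OFFSET = 2


def normalize_device_id(record_device_id: int, offset: int = DEVICE_ID_OFFSET) -> int:
    """Return the 0-based device index for a raw record device id."""
    index = record_device_id - offset
    if index < 0:
        # A raw id below the offset means the assumed offset is wrong here.
        raise ValueError(
            f"device id {record_device_id} is below the assumed offset {offset}"
        )
    return index


# Raw ids 2..9, as described in the summary, map to indices 0..7.
raw_ids = list(range(2, 10))
normalized = [normalize_device_id(raw_id) for raw_id in raw_ids]
```

Raising on a negative index, rather than clamping it, makes a wrong offset assumption visible instead of silently attributing events to device 0.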
Summary: Users have complained that STAGE level logs are printed by default. If a user is running many profiles, this can clutter STDOUT. Lower the STAGE level such that...
The trace view is one of my most important tools for profiling PyTorch programs, but for the last few days I have not been able to get it to display anything: ...
Hi guys, I've recently been trying to use `torch.profile` to profile a large NLP model. However, I have encountered some problems and would like to get some advice: 1....
PyTorch Lightning saves traces in the format `fit-profiler-[Strategy]SingleDeviceStrategy-ts.pt.trace.json`, and thus the `traceName` key contains a square brace `]`. Since the `traceName` key is the final key in the JSON file,...
Any updates on the [documentation](https://github.com/pytorch/kineto/blob/main/libkineto/README.md#how-libkineto-works)? Would be interested in learning more about how libkineto works so as to extend beyond pytorch applications. Thanks!
To follow up on the discussion in https://github.com/pytorch/kineto/pull/868, we can continue discussing which clock to use for timestamp collection. @mwootton pointed out that we should always be using a...
## Summary Currently the memory profiler feature in PyTorch is available via the [profiler API](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html#using-profiler-to-analyze-memory-consumption) by passing `profile_memory=True` in the interface. It is desirable to also enable memory profiling using...
Hi guys, following the steps in [README.md](https://github.com/pytorch/kineto/tree/main/libkineto), I have succeeded in building libkineto. Then I tried to run the tests with the command `make test`, but it doesn't change anything....
Most HPC systems use [Environment Modules](https://github.com/cea-hpc/modules) to load libraries. Therefore, the ROCm libraries are loaded explicitly by version: `module load rocm/5.6.0`. On the systems I've used ROCM_PATH is defined as...