Support nsys profiler upload in all cases
For both jax.profiler (profiler=xplane in maxtext) and the GPU nsys profiler (profiler=nsys in maxtext), we upload the profile to the base_output_directory (source).
Typically this directory is on GCS, but it can also be local. However, for the nsys profiler we hardcode the uploader to use gsutil (source), which has two problems:
- Output directory may not be GCS, so gsutil is not applicable
- Hosts may not have gsutil installed, since gsutil is not in requirements.txt
We should modify the nsys profile upload to work in all cases.
Additional context: https://github.com/AI-Hypercomputer/maxtext/pull/909 was added as a temporary fix for the second problem. When gsutil is missing, we skip the profile upload so that training can continue.
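
One possible shape for an uploader that works in both cases is sketched below. It is only an illustration, not MaxText's actual code: `upload_profile` and `split_gcs_path` are hypothetical names, and the GCS branch assumes the `google-cloud-storage` Python package is available (it is imported lazily so that local-only runs do not need it).

```python
import os
import shutil


def split_gcs_path(path):
    """Split 'gs://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    bucket, _, prefix = path[len("gs://"):].partition("/")
    return bucket, prefix


def upload_profile(local_file, output_dir):
    """Copy a profile to output_dir, which may be a local path or a gs:// URL.

    Hypothetical helper for illustration only.
    """
    filename = os.path.basename(local_file)
    if output_dir.startswith("gs://"):
        # Assumption: google-cloud-storage is installed on the host.
        from google.cloud import storage

        bucket_name, prefix = split_gcs_path(output_dir)
        prefix = prefix.rstrip("/")
        blob_name = f"{prefix}/{filename}" if prefix else filename
        blob = storage.Client().bucket(bucket_name).blob(blob_name)
        blob.upload_from_filename(local_file)
        return f"gs://{bucket_name}/{blob_name}"
    # Local output directory: a plain file copy is enough, no gsutil needed.
    os.makedirs(output_dir, exist_ok=True)
    return shutil.copy(local_file, output_dir)
```

With this kind of dispatch, a local base_output_directory never touches GCS tooling, and the GCS case depends on a Python package that could be listed in requirements.txt rather than on a separately installed CLI.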