Matthew Davidow
Tested by creating a multihost_job, simulating a maintenance event, and confirming that the logs are still visible after recovering from the simulated event: Create multihost_job ``` python3 multihost_job.py --COMMAND="bash...
Tested on 2x VLP
We are seeing ``` DeprecationWarning: `product` is deprecated as of NumPy 1.25.0, and will be removed in NumPy 2.0. Please use `prod` instead. ```
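The warning names its own fix: call `prod` instead of `product`. A minimal sketch of the replacement follows, assuming the deprecated call computes an element count from a shape tuple. The `shape` value and variable names are hypothetical and not taken from the maxtext code.

```python
import warnings

import numpy as np

# Hypothetical shape, used only for illustration.
shape = (2, 3, 4)

# np.prod is the supported spelling; np.product is deprecated as of NumPy 1.25.0.
num_elements = int(np.prod(shape))

# Optional check: escalate DeprecationWarning to an error so any remaining
# deprecated call in this block would fail loudly instead of only warning.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    checked = int(np.prod(shape))
```

Running a test suite with `DeprecationWarning` escalated in the same way is one option for finding other remaining call sites.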
For both `jax.profiler` (`profiler=xplane` in maxtext) and a GPU nsys profiler (`profiler=nsys` in maxtext), we upload the profile to the `base_output_directory` ([source](https://github.com/AI-Hypercomputer/maxtext/blob/0a919c19911ea2d99445e72a59e838f466b962c6/MaxText/pyconfig.py#L317)). Typically this directory is on GCS; it can also...