Skye Wanderman-Milne
FWIW the screenshots above were generated with the host placing rides, which displayed correctly for the host, but incorrectly for the client (i.e. the screenshots are from the client). So...
Hey @ayaka14732, sorry for the delay! I was hiking in Nepal :mountain_snow: Just to make sure I understand, is the issue that `/tmp/libtpu_lockfile` sometimes exists even when no process is...
This is strange. The lockfile should be automatically deleted when the Python process exits. Can you rerun the same set of commands, but first make sure the lockfile doesn't exist...
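A minimal sketch of that pre-flight check, assuming the path from above (`/tmp/libtpu_lockfile`) and that no other TPU process is actually running:

```shell
# Remove a stale TPU lockfile (if any) before rerunning the commands.
# Only safe when you're sure no other process is using the TPU.
LOCKFILE=/tmp/libtpu_lockfile
if [ -e "$LOCKFILE" ]; then
  echo "stale lockfile found; removing it"
  rm -f "$LOCKFILE"
else
  echo "no lockfile present"
fi
```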
@shashank2000 that error message is only triggered by the presence of the lockfile. Can you double-check that the lockfile isn't there and that you're seeing that exact error message?
I'm guessing you somehow have GPU preallocation disabled. You can check by setting the env var `TF_CPP_MIN_LOG_LEVEL=0`, which turns on extra JAX logging (and TensorFlow logging, we use the same...
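For concreteness, a sketch of turning that logging on; the `python -c` invocation here is just a stand-in for whatever script you're actually running:

```shell
# Level 0 is the most verbose; JAX reuses TensorFlow's C++ logging machinery,
# so this surfaces allocator/backend messages from both.
export TF_CPP_MIN_LOG_LEVEL=0
# Substitute your own entry point for this placeholder command.
python -c "print('running with verbose logging enabled')"
```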
That's very strange. Can you provide the full jax + nvidia-smi output?
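One way to gather both pieces of output in one go (a sketch; the fallbacks are just so it degrades gracefully on a machine without an NVIDIA driver):

```shell
# JAX's view of the available devices...
python -c "import jax; print(jax.devices())" 2>&1 || true
# ...and the driver's view of GPU memory/utilization.
nvidia-smi 2>&1 || echo "nvidia-smi not available on this machine"
```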
Yes, I believe doing something like `with jax.default_device(jax.devices("cpu")[0])` should work. Please let me know if it doesn't, I haven't tried it with pickle.
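A minimal sketch of the pattern (array creation here stands in for the unpickling step; `Array.devices()` assumes a reasonably recent JAX, 0.4+):

```python
import jax
import jax.numpy as jnp

# Pin array creation to the CPU backend for the duration of the block,
# e.g. so deserialized arrays don't land on an accelerator.
cpu = jax.devices("cpu")[0]
with jax.default_device(cpu):
    x = jnp.arange(4)  # created on the CPU device

# devices() returns the set of devices holding the array
print(x.devices())
```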
Thanks for the fast reply! I guess consider this a feature request for something akin to `--vmodule` then :) Using `-v` for now works well enough (which is realistically what...
@gehring your understanding is correct! It shouldn't be too bad to add a separate option for allocating an absolute amount of memory, although it will require a C++ change to TF (around [here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/pjrt/nvidia_gpu_device.cc#L118-L119))....
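In the meantime, the existing knob is fractional; a sketch of using it (`.50` is an arbitrary example value, and the default is `.75`):

```shell
# Cap JAX's preallocation at 50% of total GPU memory.
# Must be set before the process imports jax.
export XLA_PYTHON_CLIENT_MEM_FRACTION=.50
echo "mem fraction set to $XLA_PYTHON_CLIENT_MEM_FRACTION"
```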
`XLA_PYTHON_CLIENT_PREALLOCATE=false` only affects pre-allocation, so as you've observed, memory will never be released by the allocator (although it will be available for other DeviceArrays in the same process). You...
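A sketch of wiring this up from Python, assuming you want the variables set programmatically rather than in the shell (note they must be set before JAX initializes its backends, i.e. before the first `import jax` in the process):

```python
import os

# Disable up-front preallocation; memory is grabbed on demand instead.
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"
# Optional: the "platform" allocator frees blocks back to the driver on
# deallocation, at a performance cost; mainly useful when debugging OOMs.
os.environ["XLA_PYTHON_CLIENT_ALLOCATOR"] = "platform"

import jax
print(jax.devices())
```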