Hi, I'm using a v3-8 TPU on GCP, and while loading the model I get the error below:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/deep_c/workspace/LWM/lwm/vision_chat.py", line 254, in <module>
run(main)
File "/home/deep_c/miniconda3/envs/large_vision_model/lib/python3.10/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/deep_c/miniconda3/envs/large_vision_model/lib/python3.10/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/home/deep_c/workspace/LWM/lwm/vision_chat.py", line 250, in main
output = sampler(prompts, FLAGS.max_n_frames)[0]
File "/home/deep_c/workspace/LWM/lwm/vision_chat.py", line 230, in __call__
output, self.sharded_rng = self._forward_generate(
jaxlib.xla_extension.XlaRuntimeError: RESOURCE_EXHAUSTED: XLA:TPU compile permanent error. Ran out of memory in memory space hbm. Used 21.95G of 15.48G hbm.
Exceeded hbm capacity by 6.47G.
Total hbm usage >= 22.47G:
reserved 530.00M
program 21.95G
arguments 0B
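As a sanity check on the numbers in the message, here is a minimal Python sketch that redoes the arithmetic. The helper name `to_gib` is hypothetical, and I am assuming the `G`/`M` suffixes printed by XLA are binary units (1G = 1024M); with that assumption the reserved and program figures add up to the reported total. Since `arguments` is 0B, the overrun seems to come from the compiled program itself rather than from its inputs.

```python
import re

# Assumption: XLA prints sizes in binary units (1G = 1024M = 1024**3 B).
_UNIT_TO_GIB = {"B": 1 / 1024**3, "K": 1 / 1024**2, "M": 1 / 1024, "G": 1.0}


def to_gib(text):
    """Convert a size string such as '21.95G' or '530.00M' to GiB."""
    match = re.fullmatch(r"([\d.]+)([BKMG])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_TO_GIB[unit]


# Figures copied from the error message above.
used = to_gib("21.95G")       # "program"
capacity = to_gib("15.48G")   # HBM available to the program
reserved = to_gib("530.00M")  # "reserved"

overrun = used - capacity     # how much the program exceeds capacity
total = used + reserved       # "Total hbm usage"

print(f"overrun: {overrun:.2f}G, total: {total:.2f}G")
```

So the program would need to shrink by roughly 6.5G per device (or be sharded across more devices) before it fits.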
How can I fix this?