openpi
Error occurs when running remote inference
I run the policy server with this command:
uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi05_libero --policy.dir=gs://openpi-assets/checkpoints/pi0_base
but it fails with the following error after the checkpoint is restored:
INFO:root:Loading model...
INFO:2025-10-21 05:46:22,801:jax._src.xla_bridge:925: Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
INFO:2025-10-21 05:46:22,801:jax._src.xla_bridge:925: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
INFO:jax._src.xla_bridge:Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
INFO:absl:orbax-checkpoint version: 0.11.13
INFO:absl:Created BasePyTreeCheckpointHandler: use_ocdbt=True, use_zarr3=False, pytree_metadata_options=PyTreeMetadataOptions(support_rich_types=False), array_metadata_store=<orbax.checkpoint._src.metadata.array_metadata_store.Store object at 0x7fccf0f73ad0>
INFO:absl:Restoring checkpoint from /home/wjg/.cache/openpi/openpi-assets/checkpoints/pi0_base/params.
INFO:absl:[thread=MainThread] Failed to get flag value for EXPERIMENTAL_ORBAX_USE_DISTRIBUTED_PROCESS_ID.
INFO:absl:[process=0][thread=MainThread] No metadata found for any process_index, checkpoint_dir=/home/wjg/.cache/openpi/openpi-assets/checkpoints/pi0_base/params. time elapsed=0.00042700767517089844 seconds. If the checkpoint does not contain jax.Array then it is expected. If checkpoint contains jax.Array then it should lead to an error eventually; if no error is raised then it is a bug.
INFO:absl:[process=0] /jax/checkpoint/read/bytes_per_sec: 3.7 GiB/s (total bytes: 48.3 GiB) (time elapsed: 13 seconds) (per-host)
INFO:absl:Finished restoring checkpoint in 13.18 seconds from /home/wjg/.cache/openpi/openpi-assets/checkpoints/pi0_base/params.
Traceback (most recent call last):
File "/data/wjg_files/openpi/scripts/serve_policy.py", line 122, in
- at keypath '': expected <class 'dict'> with 5 children, got <class 'dict'> with 3 children, so the numbers of children do not match, with the symmetric difference of key sets: {'time_mlp_out' 'time_mlp_in'}.
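For reference, the last line of the traceback appears to report a mismatch between the top-level keys of the parameter tree the config expects and the keys found in the restored checkpoint. Note that the command pairs the `pi05_libero` config with the `pi0_base` checkpoint, which may be where the mismatch comes from. Below is a minimal sketch in plain Python of how such a symmetric difference arises. The helper `diff_keys` and the `shared_*` key names are hypothetical placeholders, not openpi code or real parameter names; only `time_mlp_in` and `time_mlp_out` are taken from the error message.

```python
def diff_keys(expected: dict, restored: dict) -> dict:
    """Compare the top-level keys of two dict-based parameter trees."""
    expected_keys = set(expected)
    restored_keys = set(restored)
    return {
        # Keys the model wants but the checkpoint does not provide.
        "missing_from_checkpoint": sorted(expected_keys - restored_keys),
        # Keys the checkpoint provides but the model does not use.
        "unexpected_in_checkpoint": sorted(restored_keys - expected_keys),
        # Keys present on exactly one side, as reported in the error.
        "symmetric_difference": sorted(expected_keys ^ restored_keys),
    }


# Hypothetical placeholder trees: 5 expected children vs. 3 restored children,
# mirroring the counts in the error message. The shared_* names are made up.
expected = {
    "shared_a": 0,
    "shared_b": 0,
    "shared_c": 0,
    "time_mlp_in": 0,
    "time_mlp_out": 0,
}
restored = {"shared_a": 0, "shared_b": 0, "shared_c": 0}

report = diff_keys(expected, restored)
print(report)
```

Running the same kind of comparison on the real expected and restored parameter trees would show which side is missing `time_mlp_in` and `time_mlp_out`, and so whether the config and the checkpoint describe the same model variant.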
How can I solve this?