[Bug] When running on the cluster: FATAL: container creation failed: destination /mmfs1 doesn't exist in container
I started with a clean orbit pulled from this repository, followed the documentation's guide, and installed:
Docker version 24.0.2, Docker Compose version v2.18.1, apptainer version 1.3.0
Everything succeeded until I ran:
./docker/container.sh job --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
Returned:
sbatch: error: No account specified, defaulting to: cse
sbatch: error: No partition specified, defaulting to: compute
sbatch: error: Batch job submission failed: Invalid qos specification
Since this didn't work, I logged in to the cluster and manually ran:
sh ./docker/cluster/submit_job.sh ${CLUSTER_ORBIT_DIR} --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
Job submission succeeded, but the output shows:
FATAL: container creation failed: mount hook function failure: mount /var/apptainer/mnt/session/mmfs1->/mmfs1 error: while mounting /var/apptainer/mnt/session/mmfs1: destination /mmfs1 doesn't exist in container
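Regarding the earlier `sbatch` errors (no account, no partition, invalid QOS): one possible reading is that the cluster expects these to be passed explicitly. Below is a minimal sketch of building the corresponding `sbatch` flags, assuming the values are known. `CLUSTER_ACCOUNT`, `CLUSTER_PARTITION`, and `CLUSTER_QOS` are hypothetical variable names used here for illustration only; they are not defined by orbit.

```shell
# Hypothetical helper: print one sbatch flag per line for each value that is set.
# CLUSTER_ACCOUNT, CLUSTER_PARTITION and CLUSTER_QOS are illustrative names,
# not variables that orbit defines.
build_sbatch_flags() {
    if [ -n "${CLUSTER_ACCOUNT:-}" ]; then
        printf '%s\n' "--account=${CLUSTER_ACCOUNT}"
    fi
    if [ -n "${CLUSTER_PARTITION:-}" ]; then
        printf '%s\n' "--partition=${CLUSTER_PARTITION}"
    fi
    if [ -n "${CLUSTER_QOS:-}" ]; then
        printf '%s\n' "--qos=${CLUSTER_QOS}"
    fi
}

# Possible usage (word splitting of the flags is intentional):
#   sbatch $(build_sbatch_flags) your_job_script.sh
```

Which account, partition, and QOS values are valid depends on the cluster; the site administrators or `sacctmgr show assoc user=$USER` should be able to tell.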
Steps to reproduce
Follow the cluster guide with a clean orbit install.
Running
./docker/container.sh job --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
Returned:
sbatch: error: No account specified, defaulting to: cse
sbatch: error: No partition specified, defaulting to: compute
sbatch: error: Batch job submission failed: Invalid qos specification
Or running
sh ./docker/cluster/submit_job.sh ${CLUSTER_ORBIT_DIR} --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
Returned:
(run_singularity.py): Called on compute node with arguments --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --video --offscreen_render
WARNING: nv files may not be bound with --writable
WARNING: By using --writable, Apptainer can't create /mmfs1 destination automatically without overlay or underlay
FATAL: container creation failed: mount hook function failure: mount /var/apptainer/mnt/session/mmfs1->/mmfs1 error: while mounting /var/apptainer/mnt/session/mmfs1: destination /mmfs1 doesn't exist in container
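The second warning states that with `--writable`, Apptainer cannot create the `/mmfs1` destination on its own. A possible workaround, sketched below, is to create the bind destination inside the container before it is started. This assumes the container is unpacked as a writable sandbox directory on the compute node; `ensure_bind_targets` and the sandbox path are hypothetical and not part of orbit's scripts.

```shell
# Hypothetical helper: create bind-mount destinations inside a writable sandbox.
# $1 is the sandbox root directory (assumed to be an unpacked container);
# the remaining arguments are absolute paths that will be bound into it.
ensure_bind_targets() {
    sandbox="$1"
    shift
    for target in "$@"; do
        # Create e.g. <sandbox>/mmfs1 so the mount destination exists.
        mkdir -p "${sandbox}${target}"
    done
}

# Possible usage, before the container is executed:
#   ensure_bind_targets /path/to/sandbox /mmfs1
```

Whether this fits depends on where the cluster scripts extract the container, so the call would need to be placed after that extraction step.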
System Info
Describe the characteristics of your environment:
- Commit: [95a4927]
- Isaac Sim Version: 2023.1.0
- OS: Ubuntu 22.04
- Docker version 24.0.2
- Docker Compose version v2.18.1
- apptainer version 1.3.0
ACCEPT_EULA=Y
ISAACSIM_VERSION=2023.1.1
DOCKER_ISAACSIM_PATH=/isaac-sim
DOCKER_USER_HOME=/root
CLUSTER_ISAAC_SIM_CACHE_DIR=/path/to/docker-isaac-sim
CLUSTER_ORBIT_DIR=/path/to/orbit
CLUSTER_LOGIN=...........edu
CLUSTER_SIF_PATH=/path/to/sif_path/
CLUSTER_PYTHON_EXECUTABLE=source/standalone/workflows/rsl_rl/train.py
Checklist
- [x] I have checked that there is no similar issue in the repo (required)
- [x] I have checked that the issue is not in running Isaac Sim itself and is related to the repo
Acceptance Criteria
Add the criteria for which this task is considered done. If not known at issue creation time, you can add this once the issue is assigned.
- [ ] No mount issue when a job is submitted to the cluster
@pascal-roth Any idea here?
This looks like an Apptainer and Docker version issue. Can you try using Apptainer version 1.2.5-1.el7 and Docker version 24.0.7 on the system where you build the Singularity file?
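To compare the installed versions against the ones suggested here, something like the following could be run on the build machine. It is only a sketch: `extract_version` and `check_version` are hypothetical helpers, and they assume the tools print a version string of the form shown in the report (for example `apptainer version 1.3.0`).

```shell
# Hypothetical helper: pull the first X.Y.Z number out of a version string.
extract_version() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: report whether a tool's version equals the expected one.
# Arguments: tool name, expected X.Y.Z, raw output of the tool's version command.
check_version() {
    actual="$(extract_version "$3")"
    if [ "$actual" = "$2" ]; then
        printf '%s\n' "$1 $actual: matches"
    else
        printf '%s\n' "$1 $actual: expected $2"
    fi
}

# Possible usage on the machine that builds the Singularity file:
#   check_version apptainer 1.2.5 "$(apptainer --version)"
#   check_version docker 24.0.7 "$(docker --version)"
```

The comparison is an exact match on X.Y.Z only, so the package suffix in `1.2.5-1.el7` is ignored.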