EncoderDecoder is not in the model registry.
I am installing mmsegmentation in a SIF image, following this official Dockerfile as closely as I can. The only significant difference is that I am using an NVIDIA base image. Here is what I am doing:
FROM nvcr.io/nvidia/pytorch:23.03-py3
# Install MMCV
ARG MMCV="2.0.0"
RUN ["/bin/bash", "-c", "pip install openmim"]
RUN ["/bin/bash", "-c", "mim install mmengine"]
RUN ["/bin/bash", "-c", "mim install mmcv==${MMCV}"]
# Install MMSEGMENTATION
RUN git clone -b main https://github.com/open-mmlab/mmsegmentation.git /mmsegmentation
WORKDIR /mmsegmentation
ENV FORCE_CUDA="1"
RUN pip install -v -e .
# Install requirements
COPY requirements_pip.txt /var/tmp/requirements_pip.txt
RUN pip --no-cache-dir install -r /var/tmp/requirements_pip.txt
Yet, when I try to run a training script via Slurm, this is the error I get:
Traceback (most recent call last):
File "/mmsegmentation/tools/train.py", line 104, in <module>
main()
File "/mmsegmentation/tools/train.py", line 93, in main
runner = Runner.from_cfg(cfg)
File "/usr/local/lib/python3.8/dist-packages/mmengine/runner/runner.py", line 439, in from_cfg
runner = cls(
File "/usr/local/lib/python3.8/dist-packages/mmengine/runner/runner.py", line 406, in __init__
self.model = self.build_model(model)
File "/usr/local/lib/python3.8/dist-packages/mmengine/runner/runner.py", line 813, in build_model
model = MODELS.build(model)
File "/usr/local/lib/python3.8/dist-packages/mmengine/registry/registry.py", line 548, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/usr/local/lib/python3.8/dist-packages/mmengine/registry/build_functions.py", line 250, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/usr/local/lib/python3.8/dist-packages/mmengine/registry/build_functions.py", line 100, in build_from_cfg
raise KeyError(
KeyError: 'EncoderDecoder is not in the model registry. Please check whether the value of `EncoderDecoder` is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'
In case it's important, here's the Slurm call I am making:
IMAGE="path_to_image/image.sif"
CONFIG="path_to_config/config.py"
OUTDIR="path_to_dir/output_dir/"
COMMAND="python -u /mmsegmentation/tools/train.py ${CONFIG} --work-dir ${OUTDIR}"
srun -N 1 singularity exec --nv --pwd /mmsegmentation $IMAGE $COMMAND
Can anyone tell me what's wrong? I've been banging my head against this error for quite a while.
OK, I fixed this.
I am not sure what was causing the issue, but everything worked once I moved config.py from an arbitrary location into the mmsegmentation/configs/setr directory.
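A possible explanation, not confirmed anywhere in this thread: the configs shipped under mmsegmentation/configs normally inherit `_base_/default_runtime.py`, which sets `default_scope = 'mmseg'`. mmengine uses that scope to look up type names such as `EncoderDecoder` in mmseg's registries, so a standalone config that never sets it could plausibly raise the KeyError above wherever the file lives. Assuming that is what happened here, a line like the following at the top of the standalone config would be the first thing to try:

```python
# Top of a standalone config file (e.g. the config.py from the question).
# Assumption: the file does not already inherit _base_/default_runtime.py.
default_scope = 'mmseg'  # ask mmengine to resolve type names in mmseg's registries
```

If the config already inherits a base file that sets this value, the cause is likely something else.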
@openmmlab-bot is there an official Singularity definition file for mmsegmentation, for those who run it on HPC systems?
@MatCorr would it be possible to share the SIF you used with Singularity? Otherwise, a def file would also be very helpful. Thank you in advance.
@bobleegogogo, I'm sending you the Dockerfile I'm using to generate the SIF. I hope it's helpful!
FROM nvcr.io/nvidia/pytorch:23.04-py3
# SLURM PMI2 version 20.11.9
RUN apt-get update -y && \
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
bzip2 \
file \
make \
perl \
tar \
wget && \
rm -rf /var/lib/apt/lists/*
RUN mkdir -p /var/tmp && wget -q -nc --no-check-certificate -P /var/tmp https://download.schedmd.com/slurm/slurm-20.11.9.tar.bz2 && \
mkdir -p /var/tmp && tar -x -f /var/tmp/slurm-20.11.9.tar.bz2 -C /var/tmp -j && \
cd /var/tmp/slurm-20.11.9 && ./configure --prefix=/usr/local/slurm-pmi2 && \
cd /var/tmp/slurm-20.11.9 && \
make -C contribs/pmi2 install && \
rm -rf /var/tmp/slurm-20.11.9 /var/tmp/slurm-20.11.9.tar.bz2
# Install MMCV
ARG MMCV="2.0.0"
RUN ["/bin/bash", "-c", "pip install openmim"]
RUN ["/bin/bash", "-c", "mim install mmengine"]
RUN ["/bin/bash", "-c", "mim install mmcv==${MMCV}"]
# Install MMSEGMENTATION
RUN git clone -b main https://github.com/open-mmlab/mmsegmentation.git /mmsegmentation
WORKDIR /mmsegmentation
ENV FORCE_CUDA="1"
RUN pip install -v -e .
@MatCorr great, really appreciate it ;)
I didn't build mmsegmentation from source; I installed it as a dependency. Where should I place the config file? I get this same error, which is confusing because EncoderDecoder is an official model type.
@Sere1nz, if I remember correctly, a new config file can't be placed just anywhere. It has to go inside the configs folder of mmsegmentation; that is, somewhere inside one of these folders.
This issue persists even when the file is placed under configs > any_model > your_config.py. I inspected all the model fields and still don't know what is causing it.
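For anyone still stuck, one thing that may be worth ruling out is whether the config ever sets `default_scope`, the setting mmengine uses to decide which library's registries to search for a type name such as `EncoderDecoder`. The sketch below is a rough check that uses only the standard library. `top_level_names` and `diagnose` are hypothetical helpers, not part of mmsegmentation or mmengine, and the idea that a missing `default_scope` is behind the KeyError is an assumption, not something established in this thread. The check reads only the file's own text and does not follow `_base_` inheritance.

```python
import ast


def top_level_names(config_source: str) -> set:
    """Return the names assigned at the top level of a Python config file."""
    names = set()
    for node in ast.parse(config_source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def diagnose(config_source: str) -> str:
    """Give a coarse hint about where default_scope could come from."""
    names = top_level_names(config_source)
    if "default_scope" in names:
        return "sets default_scope directly"
    if "_base_" in names:
        # Inherited files are not inspected here; that has to be checked by hand.
        return "inherits _base_ files; check that one of them sets default_scope"
    return "no default_scope and no _base_; bare type names may not resolve"


# Illustrative config text, not taken from the thread.
flat_config = "model = dict(type='EncoderDecoder')\n"
print(diagnose(flat_config))
```

To check a real file, pass its contents, for example `diagnose(open("your_config.py").read())`, where the path is a placeholder.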
I have the same problem. Have you solved it?