All `accelerate` commands and imports fail in Singularity conversion of Docker container

Open serenalotreck opened this issue 1 year ago • 2 comments

System Info

- `Accelerate` version: 0.19.0
- Platform: Linux-3.10.0-1160.80.1.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.11.3
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.0.1 (True)
- System RAM: 187.36 GB
- GPU type: Tesla V100S-PCIE-32GB
- `Accelerate` default config:
        Not found

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • [ ] My own task or dataset (give details below)

Reproduction

I'm working on an HPC cluster, so I can only use Singularity. To reproduce:

singularity pull docker://huggingface/accelerate-gpu
singularity shell accelerate-gpu.sif
conda init
source ~/.bashrc
conda activate accelerate

Then, when running any of the following snippets:

accelerate env

(The output of `accelerate env` that I pasted above is from outside my Singularity container.)

accelerate test
python3
>>> from transformers import pipeline

I get the same final error; the tracebacks are of course different, but they all end the same way:

Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/opt/conda/envs/accelerate/bin/accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 33, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 119, in <module>
    from .megatron_lm import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
    from transformers.modeling_outputs import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
    from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
    from .configuration_utils import PretrainedConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
    import requests
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'
Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/opt/conda/envs/accelerate/bin/accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 33, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 119, in <module>
    from .megatron_lm import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
    from transformers.modeling_outputs import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
    from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
    from .configuration_utils import PretrainedConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
    import requests
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
    from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
    from .configuration_utils import PretrainedConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
    import requests
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'
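
A note on the tracebacks above: every frame after the entry point resolves under `/mnt/home/lotrecks/.local/lib/python3.8/site-packages` rather than under `/opt/conda/envs/accelerate`. Singularity binds the host home directory by default, so this may mean that packages from the per-user site directory are shadowing the ones in the container's environment. Below is a minimal sketch for checking this from inside the container, using only the standard library. The function names `locate_modules` and `is_under_user_site` are illustrative and not part of Accelerate.

```python
import importlib.util
import os
import site
import sys


def locate_modules(names):
    """Return {name: file path or None} without importing the modules themselves."""
    origins = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        origins[name] = spec.origin if spec is not None else None
    return origins


def is_under_user_site(path):
    """True if path lies inside the per-user site-packages directory."""
    if not path:
        return False
    user_site = os.path.abspath(site.getusersitepackages())
    return os.path.abspath(path).startswith(user_site + os.sep)


if __name__ == "__main__":
    print("interpreter:      ", sys.executable)
    print("user site enabled:", site.ENABLE_USER_SITE)
    print("user site dir:    ", site.getusersitepackages())
    # Top-level packages that appear in the tracebacks above.
    names = ["accelerate", "transformers", "requests", "chardet"]
    for name, origin in locate_modules(names).items():
        marker = " (user site)" if is_under_user_site(origin) else ""
        print(f"{name}: {origin}{marker}")
```

If the reported paths do fall under the user site directory, starting Python with the `-s` flag or setting `PYTHONNOUSERSITE=1` keeps that directory off `sys.path`, which is one way to test whether the container's own packages import cleanly.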

Installing chardet with `conda install chardet` fails because I don't have write permission to the target environment:

Preparing transaction: done
Verifying transaction: failed

EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
  environment location: /opt/conda/envs/accelerate
  uid: 1036009
  gid: 2022
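
The error above is consistent with the environment directory being read-only for the current user, which is likely here because SIF images are immutable by default. A small sketch for confirming this from Python follows; `env_is_writable` is a hypothetical helper name, and it only checks the permission that `os.access` reports for the prefix directory.

```python
import os
import sys


def env_is_writable(prefix=sys.prefix):
    """True if the current user may create files under the environment prefix."""
    return os.path.isdir(prefix) and os.access(prefix, os.W_OK | os.X_OK)


if __name__ == "__main__":
    print("prefix:  ", sys.prefix)
    print("uid/gid: ", os.getuid(), os.getgid())
    print("writable:", env_is_writable())
```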

Expected behavior

I expected chardet to already be installed correctly when the container is instantiated.

serenalotreck • Jun 04 '23 18:06