All `accelerate` commands and imports fail in Singularity conversion of Docker container

Open serenalotreck opened this issue 1 year ago • 2 comments

System Info

- `Accelerate` version: 0.19.0
- Platform: Linux-3.10.0-1160.80.1.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.11.3
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.0.1 (True)
- System RAM: 187.36 GB
- GPU type: Tesla V100S-PCIE-32GB
- `Accelerate` default config:
        Not found

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • [ ] My own task or dataset (give details below)

Reproduction

I'm working on an HPC, so I can only use Singularity. To reproduce:

singularity pull docker://huggingface/accelerate-gpu
singularity shell accelerate-gpu.sif
conda init
source ~/.bashrc
conda activate accelerate

Then, when running any of the following snippets:

accelerate env

(The output of accelerate env that I pasted above is from outside of my Singularity container)

accelerate test
python3
>>> from transformers import pipeline

I get the same final error; the tracebacks are of course different, but they all end the same way:

Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/opt/conda/envs/accelerate/bin/accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 33, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 119, in <module>
    from .megatron_lm import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
    from transformers.modeling_outputs import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
    from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
    from .configuration_utils import PretrainedConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
    import requests
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'
Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/opt/conda/envs/accelerate/bin/accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 33, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 119, in <module>
    from .megatron_lm import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
    from transformers.modeling_outputs import (
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
    from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
    from .configuration_utils import PretrainedConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
    import requests
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
    from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
    from .configuration_utils import PretrainedConfig
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
    import requests
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'

Installing chardet with `conda install chardet` fails because I don't have permission to write to the target environment:

Preparing transaction: done
Verifying transaction: failed

EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
  environment location: /opt/conda/envs/accelerate
  uid: 1036009
  gid: 2022

Expected behavior

For chardet to be correctly installed upon container instantiation.

serenalotreck avatar Jun 04 '23 18:06 serenalotreck

I think this is the same failure we're getting on our nightlies suddenly. Can you try via the 0.19.0 image? https://hub.docker.com/layers/huggingface/accelerate-gpu/0.19.0/images/sha256-32cdb06bc3e5c4fc64412b9cdac123dfc7870e4f00ede7bd89cc4419ac745070?context=explore

muellerzr avatar Jun 04 '23 18:06 muellerzr

@muellerzr got the same chardet error when I ran accelerate env in the 0.19.0 image

serenalotreck avatar Jun 04 '23 18:06 serenalotreck

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jul 05 '23 15:07 github-actions[bot]

@muellerzr just wanted to follow up on this!

serenalotreck avatar Jul 05 '23 15:07 serenalotreck

Hi @serenalotreck, judging by the traceback this looks like a transformers issue rather than an accelerate issue. I'm not sure what the cause would be, other than perhaps an outdated requests version. The sign of this is here in the traceback:

  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
    import requests
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'

Notice how it's transformers at the end, where the dependencies get imported.
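
One detail in the traceback that may be worth checking: every failing module is loaded from `/mnt/home/lotrecks/.local/lib/python3.8/site-packages`, not from `/opt/conda/envs/accelerate`. The snippet below is a minimal diagnostic sketch, not a confirmed fix. It reports where Python would load each package from and whether that location is the per-user site-packages directory. The helper names (`module_origin`, `is_under`, `report`) are made up for illustration; only standard-library calls are used.

```python
import importlib.util
import site
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)  # locates the module without executing it
    return None if spec is None else spec.origin


def is_under(path, root):
    """True if `path` lies inside the directory `root`."""
    try:
        Path(path).resolve().relative_to(Path(root).resolve())
        return True
    except ValueError:
        return False


def report(names):
    """Return (name, origin, location) tuples for each top-level module name."""
    user_site = site.getusersitepackages()  # e.g. ~/.local/lib/pythonX.Y/site-packages
    rows = []
    for name in names:
        origin = module_origin(name)
        if origin is None:
            location = "missing"
        elif is_under(origin, user_site):
            location = "user site"
        else:
            location = "elsewhere"
        rows.append((name, origin, location))
    return rows


if __name__ == "__main__":
    # Package names taken from the traceback in this thread.
    for name, origin, location in report(["accelerate", "transformers", "requests", "chardet"]):
        print(f"{name:14} {location:10} {origin}")
```

If the packages show up as "user site" when this is run inside the container, one thing to try (an assumption on my part, not something verified in this thread) is starting Python with the `PYTHONNOUSERSITE=1` environment variable set, which tells the interpreter to skip the per-user site-packages directory.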

muellerzr avatar Jul 06 '23 15:07 muellerzr

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jul 31 '23 15:07 github-actions[bot]