All `accelerate` commands and imports fail in Singularity conversion of Docker container
System Info
- `Accelerate` version: 0.19.0
- Platform: Linux-3.10.0-1160.80.1.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.11.3
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.0.1 (True)
- System RAM: 187.36 GB
- GPU type: Tesla V100S-PCIE-32GB
- `Accelerate` default config:
Not found
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- [ ] My own task or dataset (give details below)
Reproduction
I'm working on an HPC, so I can only use Singularity. To reproduce:
singularity pull docker://huggingface/accelerate-gpu
singularity shell accelerate-gpu.sif
conda init
source ~/.bashrc
conda activate accelerate
Then, when running any of the following snippets:
accelerate env
(The output of `accelerate env` that I pasted above is from outside of my Singularity container.)
accelerate test
python3
>>> from transformers import pipeline
I get the same final error; the tracebacks are of course different, but they all end the same way:
Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
File "/opt/conda/envs/accelerate/bin/accelerate", line 5, in <module>
from accelerate.commands.accelerate_cli import main
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
from .accelerator import Accelerator
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 33, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 119, in <module>
from .megatron_lm import (
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
from transformers.modeling_outputs import (
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
from .configuration_utils import PretrainedConfig
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
import requests
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
import chardet
ModuleNotFoundError: No module named 'chardet'
Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
File "/opt/conda/envs/accelerate/bin/accelerate", line 5, in <module>
from accelerate.commands.accelerate_cli import main
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
from .accelerator import Accelerator
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 33, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 119, in <module>
from .megatron_lm import (
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
from transformers.modeling_outputs import (
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
from .configuration_utils import PretrainedConfig
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
import requests
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
import chardet
ModuleNotFoundError: No module named 'chardet'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/__init__.py", line 23, in <module>
from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_albert.py", line 18, in <module>
from .configuration_utils import PretrainedConfig
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/configuration_utils.py", line 25, in <module>
from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/transformers/file_utils.py", line 23, in <module>
import requests
File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
import chardet
ModuleNotFoundError: No module named 'chardet'
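One detail in the tracebacks above: the entry point is `/opt/conda/envs/accelerate/bin/accelerate`, but the later frames all come from `/mnt/home/lotrecks/.local/lib/python3.8/site-packages`. This may suggest that packages from my home directory are being picked up inside the container, though I have not confirmed it. The sketch below is only a diagnostic aid: it extracts file paths from a traceback and lists the ones outside a given environment prefix. The helper name `frames_outside_prefix` is hypothetical, and it assumes the standard CPython `File "...", line N` frame format.

```python
import re

# Matches the standard CPython traceback frame line: File "<path>", line <n>
FRAME_RE = re.compile(r'File "([^"]+)", line \d+')


def frames_outside_prefix(traceback_text, prefix):
    """Return unique file paths in the traceback that are not under `prefix`."""
    root = prefix.rstrip("/") + "/"
    outside = []
    for path in FRAME_RE.findall(traceback_text):
        if path.startswith("<"):
            # Skip pseudo-files such as <stdin>
            continue
        if not path.startswith(root) and path not in outside:
            outside.append(path)
    return outside


# Shortened excerpt of the traceback from this report
sample = '''Traceback (most recent call last):
  File "/opt/conda/envs/accelerate/bin/accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "/mnt/home/lotrecks/.local/lib/python3.8/site-packages/requests/__init__.py", line 44, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'
'''

outside = frames_outside_prefix(sample, "/opt/conda/envs/accelerate")
for path in outside:
    print(path)
```

Any path printed here is a module that was loaded from somewhere other than the conda environment shipped in the image.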
Installing `chardet` with `conda install chardet` fails because I don't have permission to write to the target environment:
Preparing transaction: done
Verifying transaction: failed
EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
environment location: /opt/conda/envs/accelerate
uid: 1036009
gid: 2022
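For anyone trying to reproduce this, the following could be run inside the container to check, without attempting an install, where a module would be imported from and whether the environment directory is writable. It is a minimal sketch using only the standard library; the helper names `describe_module` and `env_is_writable` are hypothetical.

```python
import importlib.util
import os
import sys


def describe_module(name):
    """Return the file a top-level module would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def env_is_writable(prefix=sys.prefix):
    """Report whether the current user can write to the environment directory."""
    return os.access(prefix, os.W_OK)


# Modules that appear in the failing import chain from this report
for module_name in ("chardet", "requests", "transformers", "accelerate"):
    print(module_name, "->", describe_module(module_name))

print("environment prefix:", sys.prefix)
print("writable by current user:", env_is_writable())
```

A result of `None` means the interpreter cannot find that module at all; a path outside the printed environment prefix means the module is coming from somewhere other than the environment itself.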
Expected behavior
For chardet to be correctly installed upon container instantiation.