
Megatron presence detection incorrect

Open Willian-Zhang opened this issue 2 years ago • 3 comments

System Info

- `Accelerate` version: 0.19.0
- Platform: Linux-5.15.0-71-generic-x86_64-with-glibc2.35
- Python version: 3.10.9
- Numpy version: 1.24.1
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- System RAM: 503.52 GB
- GPU type: NVIDIA GeForce RTX 3090
- `Accelerate` default config:
        Not found

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • [X] My own task or dataset (give details below)

Reproduction

Follow guide for running megatron

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..

pip install git+https://github.com/huggingface/Megatron-LM.git

which fails with:

│ /home/willian/miniconda3/envs/activebot/lib/python3.10/site-packages/accelerate/accelerator.py:3 │
│ 21 in __init__                                                                                   │
│                                                                                                  │
│    318 │   │                                                                                     │
│    319 │   │   if megatron_lm_plugin:                                                            │
│    320 │   │   │   if not is_megatron_lm_available():                                            │
│ ❱  321 │   │   │   │   raise ImportError("Megatron is not installed. please build it from sourc  │
│    322 │   │                                                                                     │
│    323 │   │   if ipex_plugin is None:  # init from env variables                                │
│    324 │   │   │   ipex_plugin = IntelPyTorchExtensionPlugin() if os.environ.get("IPEX_ENABLED"  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
_is_package_available("megatron")
False
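One way to see the underlying failure mode (a hypothetical diagnostic, not part of the original report): an import name and its distribution name need not match, so a metadata-only lookup can fail for a module that imports fine. The stdlib module `json` demonstrates this in any environment, since it is importable but has no distribution metadata at all — the same way `megatron` is importable while no distribution named `megatron` exists (the distribution is `megatron-lm`):

```python
import importlib.metadata
import importlib.util

# 'json' is importable, so a spec-based check succeeds...
print(importlib.util.find_spec("json") is not None)  # True

# ...but a metadata lookup keyed on the same name fails, because
# metadata is indexed by *distribution* name, not import name.
try:
    importlib.metadata.version("json")
except importlib.metadata.PackageNotFoundError:
    print("no distribution named 'json'")
```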

Expected behavior

This is probably because the caller passes the importable name (`megatron`) to `_is_package_available`, whereas the implementation:

https://github.com/huggingface/accelerate/blob/873b39b85bc5b4eb87da9ba6af1adc0e418be907/src/accelerate/utils/imports.py#L51-L57

also treats that name as the distribution name, which for Megatron-LM is `megatron-lm`.
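One possible shape for a fix is a check that accepts the import name and an optional, separate distribution name. This is only a sketch and not accelerate's actual implementation; the function and parameter names here are hypothetical:

```python
import importlib.metadata
import importlib.util
from typing import Optional

def is_package_available(import_name: str, distribution_name: Optional[str] = None) -> bool:
    # First require that the module is importable at all.
    if importlib.util.find_spec(import_name) is None:
        return False
    # Metadata must be looked up by *distribution* name, which can differ
    # from the import name (e.g. import 'megatron', distribution 'megatron-lm').
    try:
        importlib.metadata.version(distribution_name or import_name)
        return True
    except importlib.metadata.PackageNotFoundError:
        return False
```

With this shape, the Megatron check would be invoked as `is_package_available("megatron", "megatron-lm")` instead of assuming the two names coincide.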

Willian-Zhang avatar May 10 '23 18:05 Willian-Zhang

CC @pacman100

muellerzr avatar May 10 '23 18:05 muellerzr

Hello @Willian-Zhang, thank you for bringing this to our notice; this bug was recently introduced in the refactor of the import checks. It would be great if you already have a PR in mind to fix this issue; otherwise I can look into it in a few days.

pacman100 avatar May 10 '23 18:05 pacman100

any update?

bestpredicts avatar May 27 '23 14:05 bestpredicts

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jun 21 '23 15:06 github-actions[bot]