NeMo
Conda-installed environment cannot import classes/methods from megatron
Describe the bug
I created a conda environment and built NeMo from source. However, when I run from nemo.collections import llm in the Python interpreter, I get the following error:
from nemo.collections import llm
/root/.conda/envs/nemo/lib/python3.10/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
  import pynvml # type: ignore[import]
Import of quick_gelu from megatron.core.fusions.fused_bias_geglu failed with: Traceback (most recent call last):
  File "/root/brainstorm/NeMo/nemo/utils/import_utils.py", line 319, in safe_import_from
    return getattr(imported_module, symbol), True
AttributeError: module 'megatron.core.fusions.fused_bias_geglu' has no attribute 'quick_gelu'
WARNING: transformer_engine not installed. Using default recipe.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/brainstorm/NeMo/nemo/collections/llm/__init__.py", line 52, in <module>
    from nemo.collections.llm.gpt.model import ( # noqa: F401
  File "/root/brainstorm/NeMo/nemo/collections/llm/gpt/model/__init__.py", line 65, in <module>
    from nemo.collections.llm.gpt.model.hyena import (
  File "/root/brainstorm/NeMo/nemo/collections/llm/gpt/model/hyena.py", line 34, in <module>
    from nemo.collections.llm.gpt.model.megatron.hyena.hyena_model import HyenaModel as MCoreHyenaModel
  File "/root/brainstorm/NeMo/nemo/collections/llm/gpt/model/megatron/hyena/hyena_model.py", line 30, in <module>
    from megatron.core.process_groups_config import ProcessGroupCollection
ImportError: cannot import name 'ProcessGroupCollection' from 'megatron.core.process_groups_config' (/root/.conda/envs/nemo/lib/python3.10/site-packages/megatron/core/process_groups_config.py)
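Both failures above are lookups of a name inside a megatron.core module that does import, so it can help to check which file the module is loaded from and whether the name is there. A minimal sketch of that check follows; probe_symbol is a hypothetical helper written for this thread, not part of NeMo or Megatron-LM, and it uses only the standard library:

```python
import importlib


def probe_symbol(module_name: str, symbol: str) -> dict:
    """Report where a module is loaded from and whether it exposes `symbol`.

    Hypothetical diagnostic helper (not a NeMo or Megatron-LM API).
    """
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # the module is missing or fails while importing
        return {
            "module": module_name,
            "file": None,
            "has_symbol": False,
            "error": f"{type(exc).__name__}: {exc}",
        }
    return {
        "module": module_name,
        # Path of the file actually imported, e.g. under site-packages.
        "file": getattr(module, "__file__", None),
        "has_symbol": hasattr(module, symbol),
        "error": None,
    }


if __name__ == "__main__":
    # The two module/name pairs taken from the traceback in this report.
    for mod, sym in [
        ("megatron.core.process_groups_config", "ProcessGroupCollection"),
        ("megatron.core.fusions.fused_bias_geglu", "quick_gelu"),
    ]:
        print(probe_symbol(mod, sym))
```

If "file" points into site-packages and "has_symbol" is False, the installed megatron-core is probably not the version the NeMo checkout expects.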
Steps/Code to reproduce bug
My environment setup is as follows:
conda create -n nemo python==3.10.12
pip3 install torch torchvision
apt-get update && apt-get install -y libsndfile1 ffmpeg
pip install Cython packaging
git checkout main # checkout main branch of nemo
pip install -e '.[all]'
Expected behavior
I expected the package to import smoothly.
Environment details
I'm using a conda environment.
- OS version - Ubuntu 22.04.5 LTS
- PyTorch version - Stable (2.8.0)
- Python version - 3.10.12
Did you solve it? I'm also having the same problem.😭😭😭
Not yet... I think the only thing that reliably works is Docker, but I'd still prefer to use conda, so ideally it'd be nice to have this fixed.
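Try reinstalling Megatron-LM from source to get the latest updates (the ImportError suggests the installed megatron-core is out of sync with what NeMo main expects):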
pip uninstall megatron-core
pip install git+https://github.com/NVIDIA/Megatron-LM.git
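After reinstalling, it may be worth confirming which versions actually ended up in the conda environment. A rough sketch using the standard library's importlib.metadata is below; installed_versions is a hypothetical helper, and the distribution name nemo_toolkit is an assumption about how the NeMo checkout is registered with pip (megatron-core and torch are the names used in the commands in this thread):

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for this thread; it only reads package metadata and
    does not import the packages themselves.
    """
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "nemo_toolkit" is an assumed distribution name; adjust if pip lists it differently.
    names = ["megatron-core", "nemo_toolkit", "torch"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Run it with the environment's own interpreter (the one under /root/.conda/envs/nemo in the report) so the result reflects the conda environment rather than a system Python.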