RuntimeError: 0 active drivers ([]). There should only be one.
This happened after I installed deepspeed; related to https://github.com/deepspeedai/DeepSpeed/issues/7028
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/ray/_private/worker.py", line 929, in get_objects
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RuntimeError): ray::TaskRunner.run() (pid=818177, ip=10.220.1.179, actor_id=b403b4db31bb6ea1ca6ca26401000000, repr=<main_ppo.TaskRunner object at 0x7f605bf50040>)
File "/home/mertunsal/verl/verl/trainer/main_ppo.py", line 99, in run
from verl.workers.fsdp_workers import ActorRolloutRefWorker, CriticWorker
File "/home/mertunsal/verl/verl/workers/fsdp_workers.py", line 41, in <module>
from verl.workers.sharding_manager.fsdp_ulysses import FSDPUlyssesShardingManager
File "/home/mertunsal/verl/verl/workers/sharding_manager/__init__.py", line 26, in <module>
if is_megatron_core_available() and is_vllm_available():
File "/home/mertunsal/verl/verl/utils/import_utils.py", line 26, in is_megatron_core_available
from megatron.core import parallel_state as mpu
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/megatron/core/__init__.py", line 2, in <module>
import megatron.core.tensor_parallel
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/megatron/core/tensor_parallel/__init__.py", line 2, in <module>
from .cross_entropy import vocab_parallel_cross_entropy
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/megatron/core/tensor_parallel/cross_entropy.py", line 7, in <module>
from megatron.core.parallel_state import (
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/megatron/core/parallel_state.py", line 14, in <module>
from .utils import GlobalMemoryBuffer, is_torch_min_version
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/megatron/core/utils.py", line 1405, in <module>
from transformer_engine.pytorch.float8_tensor import Float8Tensor
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/transformer_engine/__init__.py", line 13, in <module>
from . import pytorch
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/transformer_engine/pytorch/__init__.py", line 81, in <module>
from transformer_engine.pytorch.permutation import (
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/transformer_engine/pytorch/permutation.py", line 11, in <module>
import transformer_engine.pytorch.triton.permutation as triton_permutation
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/transformer_engine/pytorch/triton/permutation.py", line 123, in <module>
def _permute_kernel(
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 368, in decorator
return Autotuner(fn, fn.arg_names, configs, key, reset_to_zero, restore_value, pre_hook=pre_hook,
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 130, in __init__
self.do_bench = driver.active.get_benchmarker()
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/triton/runtime/driver.py", line 23, in __getattr__
self._initialize_obj()
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
File "/home/mertunsal/miniconda3/envs/verl/lib/python3.10/site-packages/triton/runtime/driver.py", line 8, in _create_driver
raise RuntimeError(f"{len(actives)} active drivers ({actives}). There should only be one.")
RuntimeError: 0 active drivers ([]). There should only be one.
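For context, the check that fails is in triton/runtime/driver.py: Triton expects exactly one backend whose driver reports is_active(), and here it finds none. A minimal sketch to see which backends Triton detects (assuming Triton 3.x internals, where discovered backends live under triton.backends):

from triton.backends import backends

# Mirrors the lookup in _create_driver: exactly one backend driver must report active.
for name, backend in backends.items():
    print(name, backend.driver.is_active())

If nothing prints True, Triton's CUDA backend is not being detected in this environment.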
Do you have to use deepspeed? You can simply uninstall deepspeed as verl does not require deepspeed.
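To see what installing deepspeed actually changed, a quick check of the relevant package versions may help (a sketch using importlib.metadata; compare the output before and after installing deepspeed):

from importlib.metadata import PackageNotFoundError, version

# Installing deepspeed can pull in or re-pin triton; this prints what is
# currently resolved in the environment.
for pkg in ("deepspeed", "triton", "torch"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")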
It's an incompatibility. I ran into this too, and had to uninstall deepspeed before it would work.
File "/usr/local/lib/python3.11/site-packages/verl/trainer/main_ppo.py", line 87, in run
from verl.workers.megatron_workers import ActorRolloutRefWorker, CriticWorker
File "/usr/local/lib/python3.11/site-packages/verl/workers/megatron_workers.py", line 26, in
使用Megatron-LM作为后端时,也遇到了这个问题,请问目前有解决方式吗?
I ran into this problem too. Uninstalling deepspeed fixed it, and even my vllm/triton compatibility issues were resolved the same way. Honestly absurd.
File "/home/ma-user/anaconda3/envs/pt/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 368, in decorator
return Autotuner(fn, fn.arg_names, configs, key, reset_to_zero, restore_value, pre_hook=pre_hook,
File "/home/ma-user/anaconda3/envs/pt/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 130, in __init__
self.do_bench = driver.active.get_benchmarker()
File "/home/ma-user/anaconda3/envs/pt/lib/python3.10/site-packages/triton/runtime/driver.py", line 23, in __getattr__
self._initialize_obj()
File "/home/ma-user/anaconda3/envs/pt/lib/python3.10/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
File "/home/ma-user/anaconda3/envs/pt/lib/python3.10/site-packages/triton/runtime/driver.py", line 8, in _create_driver
raise RuntimeError(f"{len(actives)} active drivers ({actives}). There should only be one.")
RuntimeError: 0 active drivers ([]). There should only be one.
I uninstalled deepspeed but still get the same error. What should I do?
I ran into this problem as well. How did you solve it?
pip install triton==3.1.0 fixes it.
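For reference, one way to confirm the pin took effect is to re-run the import chain from the tracebacks above (a sketch; transformer_engine triggers the Triton autotuner at import time, which is where the error was raised):

import triton
print("triton", triton.__version__)

# This import previously failed in the Triton autotuner with "0 active drivers";
# it should now succeed if the pinned triton can see the GPU backend.
import transformer_engine.pytorch
print("transformer_engine import OK")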
What if the environment is torch 2.6.0+cu124, which depends on triton==3.2.0, so triton==3.1.0 cannot be installed?