flash-linear-attention
[Bug]: multi-GPU, TypeError: 'NoneType' object is not a mapping
Describe the bug
Thank you very much for your excellent work! When I train on multiple GPUs, Triton's autotuner.py raises an error at full_nargs = {**self.nargs, **kwargs, **self.best_config.kwargs} with TypeError: 'NoneType' object is not a mapping.
When I train on a single GPU, the error does not occur. How can I train in parallel on multiple GPUs without hitting this error? (A sketch of one possible workaround follows.)
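Since the failure only appears with multiple GPUs, one plausible cause is that all GPUs are driven from a single process (for example via nn.DataParallel), so Triton's autotuner is entered concurrently from several threads and reads self.best_config before tuning has finished. Below is a minimal sketch of the usual workaround, one process per GPU via DistributedDataParallel; the Linear stand-in model and the torchrun launch line are assumptions for illustration, not code from this repo:

```python
# Sketch: one process per GPU, so each process keeps its own Triton
# autotuner state. Launch with: torchrun --nproc_per_node=<num_gpus> train.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)  # pin subsequent kernel launches to this rank's GPU

# Stand-in for the actual Transformer4SED/fla model.
model = torch.nn.Linear(512, 512).cuda()
model = DDP(model, device_ids=[local_rank])

out = model(torch.randn(8, 512, device="cuda"))
print(f"rank {dist.get_rank()}: output shape {tuple(out.shape)}")
dist.destroy_process_group()
```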
Steps to reproduce the bug
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/layers/gsa.py", line 140, in forward
hidden_states = self.norm(hidden_states)
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 659, in forward
return rms_norm_fn(
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 526, in rms_norm_fn
return LayerNormFn.apply(
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/autograd/function.py", line 539, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/utils.py", line 12, in wrapper
return fn(ctx,
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 415, in forward
y, mean, rstd, residual_out = _layer_norm_fwd(
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 172, in _layer_norm_fwd
_layer_norm_fwd_1pass_kernel[(M,)](
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/triton/runtime/autotuner.py", line 143, in run
timings = {config: self._bench(*args, config=config, **kwargs) for config in pruned_configs}
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/triton/runtime/autotuner.py", line 143, in
Expected behavior
Multi-GPU training should run without errors, just as single-GPU training does.
Environment info
- torch:
- triton:
@n2729648074 Have you found the problem? I don't have an environment at hand to reproduce the bug :-(
Has it been resolved?
@cgz6498 Hi, can you reproduce the problem when running the example code in https://github.com/sustcsonglin/flash-linear-attention/tree/main/training?
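If the example code does reproduce it, a stripped-down check that exercises only the autotuned RMSNorm kernel from the traceback might help isolate the issue. A hypothetical repro, assuming RMSNorm is importable from fla.modules.layernorm as the stack above suggests:

```python
# Hypothetical minimal repro: run fla's fused RMSNorm once per rank.
# Launch with: torchrun --nproc_per_node=2 repro.py
import os

import torch
import torch.distributed as dist
from fla.modules.layernorm import RMSNorm  # module path taken from the traceback

dist.init_process_group(backend="nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

norm = RMSNorm(512).cuda()
x = torch.randn(4, 128, 512, device="cuda")
y = norm(x)  # goes through rms_norm_fn -> _layer_norm_fwd -> autotuned Triton kernel
print(f"rank {dist.get_rank()} OK: {tuple(y.shape)}")
dist.destroy_process_group()
```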
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 7 days since being marked as stale.