flash-linear-attention
[Bug]: multi-GPU, TypeError: 'NoneType' object is not a mapping
Describe the bug
Thank you very much for your excellent work! When I train on multiple GPUs, Triton's autotuner.py raises an error at full_nargs = {**self.nargs, **kwargs, **self.best_config.kwargs} with TypeError: 'NoneType' object is not a mapping.
When I train on a single GPU, the error does not occur. How can I train in parallel on multiple GPUs without hitting this error? (A sketch of one possible workaround follows.)
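Since the failure only appears with multiple GPUs, one plausible cause is that all GPUs are driven from a single process (for example via nn.DataParallel), so Triton's autotuner is entered concurrently from several threads and reads self.best_config before tuning has finished. Below is a minimal sketch of the usual workaround, one process per GPU via DistributedDataParallel; the Linear stand-in model and the torchrun launch line are assumptions for illustration, not code from this repo:

```python
# Sketch: one process per GPU, so each process keeps its own Triton
# autotuner state. Launch with: torchrun --nproc_per_node=<num_gpus> train.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)  # pin subsequent kernel launches to this rank's GPU

# Stand-in for the actual Transformer4SED/fla model.
model = torch.nn.Linear(512, 512).cuda()
model = DDP(model, device_ids=[local_rank])

out = model(torch.randn(8, 512, device="cuda"))
print(f"rank {dist.get_rank()}: output shape {tuple(out.shape)}")
dist.destroy_process_group()
```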
Steps to reproduce the bug
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/layers/gsa.py", line 140, in forward
hidden_states = self.norm(hidden_states)
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 659, in forward
return rms_norm_fn(
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 526, in rms_norm_fn
return LayerNormFn.apply(
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/torch/autograd/function.py", line 539, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/utils.py", line 12, in wrapper
return fn(ctx,
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 415, in forward
y, mean, rstd, residual_out = _layer_norm_fwd(
File "/home/nzx/dcase_task4_sed/code/Transformer4SED-main/fla/modules/layernorm.py", line 172, in _layer_norm_fwd
_layer_norm_fwd_1pass_kernel[(M,)](
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/triton/runtime/autotuner.py", line 143, in run
timings = {config: self._bench(*args, config=config, **kwargs) for config in pruned_configs}
File "/home/nzx/anaconda3/envs/dcase2024/lib/python3.9/site-packages/triton/runtime/autotuner.py", line 143, in
Expected behavior
Multi-GPU training should run without errors, just as single-GPU training does.
Environment info
- torch:
- triton:
@n2729648074 Have you found the problem? I don't have an environment at hand to reproduce the bug :-(
Has it been resolved?
@cgz6498 Hi, can you reproduce the problem when running the example code in https://github.com/sustcsonglin/flash-linear-attention/tree/main/training?
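If the example code does reproduce it, a stripped-down check that exercises only the autotuned RMSNorm kernel from the traceback might help isolate the issue. A hypothetical repro, assuming RMSNorm is importable from fla.modules.layernorm as the stack above suggests:

```python
# Hypothetical minimal repro: run fla's fused RMSNorm once per rank.
# Launch with: torchrun --nproc_per_node=2 repro.py
import os

import torch
import torch.distributed as dist
from fla.modules.layernorm import RMSNorm  # module path taken from the traceback

dist.init_process_group(backend="nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

norm = RMSNorm(512).cuda()
x = torch.randn(4, 128, 512, device="cuda")
y = norm(x)  # goes through rms_norm_fn -> _layer_norm_fwd -> autotuned Triton kernel
print(f"rank {dist.get_rank()} OK: {tuple(y.shape)}")
dist.destroy_process_group()
```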
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 7 days since being marked as stale.