Megatron-LM
[BUG] text generation not working for --position-embedding-type rope
When I use the official examples/run_text_generation scripts, text generation works fine without rope. But with rope enabled, it raises a 500 error:
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/models/gpt/gpt_model.py", line 183, in forward
    hidden_states = self.decoder(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/transformer_block.py", line 382, in forward
    hidden_states, context = layer(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/transformer_layer.py", line 176, in forward
    attention_output_with_bias = self.self_attention(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/attention.py", line 305, in forward
    core_attn_out = self.core_attention(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/custom_layers/transformer_engine.py", line 466, in forward
    core_attn_out = super().forward(
  File "/envs//megatron/lib/python3.10/site-packages/transformer_engine/pytorch/attention.py", line 2552, in forward
    qkv_layout, query_layer, key_layer, value_layer = _get_qkv_layout(
  File "/envs//megatron/lib/python3.10/site-packages/transformer_engine/pytorch/attention.py", line 1466, in _get_qkv_layout
    raise Exception("The provided qkv memory layout is not supported!")
Exception: The provided qkv memory layout is not supported!
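For context: Transformer Engine's `_get_qkv_layout` infers a supported layout string from the strides of the query/key/value tensors it receives, and raises this exception when the strides do not match any layout it recognizes. The sketch below only illustrates that class of failure; the shapes and the `.contiguous()` call are assumptions for illustration, not the actual Megatron/TE code path.

```python
import torch

# Illustrative shapes only: [seq, batch, heads, head_dim] ("sbhd").
s, b, h, d = 128, 2, 16, 64

# A freshly allocated tensor is contiguous, so its strides follow the
# sbhd order and a stride-based layout check can classify it.
q = torch.randn(s, b, h, d)
print(q.is_contiguous())        # True

# If a preceding op (e.g. RoPE with transposed output memory) produces a
# tensor whose logical shape is still [s, b, h, d] but whose strides are
# not in sbhd order, a stride-based layout check cannot match it.
q_t = torch.randn(b, s, h, d).transpose(0, 1)   # shape [s, b, h, d]
print(q_t.is_contiguous())      # False -- "unsupported" memory layout

# Forcing a copy restores a plain sbhd layout (assumed workaround,
# not the patch from the comment linked below).
q_fixed = q_t.contiguous()
print(q_fixed.is_contiguous())  # True
```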
I solved the problem by hard-coding...
https://github.com/NVIDIA/Megatron-LM/issues/703#issuecomment-1965759788
I found that the cause is that transpose_output_memory was set to True.
No. Changing transpose_output_memory to False causes training to raise an error.
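If the layout problem only appears at generation time, one way to work around it without touching training would be to confine any fix to the inference branch. This is a hypothetical sketch: the helper name and the gating on `inference_params` are assumptions, not the hard-coded change from the linked comment.

```python
import torch
from typing import Optional, Tuple


def ensure_contiguous_qkv(
    query: torch.Tensor,
    key: torch.Tensor,
    value: torch.Tensor,
    inference_params: Optional[object] = None,
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    """Hypothetical helper: force a plain contiguous memory layout for
    q/k/v only on the inference path, leaving the training path (where
    flipping transpose_output_memory to False reportedly breaks things)
    untouched."""
    if inference_params is None:
        # Training / no incremental decoding: do nothing.
        return query, key, value
    return (
        query if query.is_contiguous() else query.contiguous(),
        key if key.is_contiguous() else key.contiguous(),
        value if value.is_contiguous() else value.contiguous(),
    )
```

Whether an extra copy per decode step is acceptable is a separate question; the actual fix behind the linked comment may look different.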
Marking as stale. No activity in 60 days.
@1049451037 This is simply because you were using an m-core model (the m-core model has bugs). Switch to the legacy model: --use-legacy-model (m-core is now the default).
Marking as stale. No activity in 60 days.