
[BUG] text generation not working for --position-embedding-type rope

Open · 1049451037 opened this issue 11 months ago · 6 comments

When I use the official examples/run_text_generation scripts, generation works normally without rope. But when I use rope, the text generation server returns a 500 error with the following traceback:

    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/models/gpt/gpt_model.py", line 183, in forward
    hidden_states = self.decoder(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/transformer_block.py", line 382, in forward
    hidden_states, context = layer(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/transformer_layer.py", line 176, in forward
    attention_output_with_bias = self.self_attention(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/attention.py", line 305, in forward
    core_attn_out = self.core_attention(
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/envs//megatron/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
    return forward_call(*args, **kwargs)
  File "/envs//Megatron-LM/megatron/core/transformer/custom_layers/transformer_engine.py", line 466, in forward
    core_attn_out = super().forward(
  File "/envs//megatron/lib/python3.10/site-packages/transformer_engine/pytorch/attention.py", line 2552, in forward
    qkv_layout, query_layer, key_layer, value_layer = _get_qkv_layout(
  File "/envs//megatron/lib/python3.10/site-packages/transformer_engine/pytorch/attention.py", line 1466, in _get_qkv_layout
    raise Exception("The provided qkv memory layout is not supported!")
Exception: The provided qkv memory layout is not supported!
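
As far as I can tell, TransformerEngine's _get_qkv_layout works out the qkv layout (for example sbhd_sbhd_sbhd) from the strides and data pointers of the query/key/value tensors it receives, so it fails when one tensor keeps the [s, b, h, d] shape but has been transposed in memory while the others have not. A minimal PyTorch sketch of that situation (illustrative only, not TransformerEngine or Megatron code; the tensor names and sizes are made up):

    import torch

    s, b, h, d = 8, 2, 4, 16                     # seq, batch, heads, head_dim

    # query laid out batch-first in memory, then viewed back to [s, b, h, d]
    q = torch.randn(b, s, h, d).transpose(0, 1)
    # key / value in plain sbhd memory
    k = torch.randn(s, b, h, d)
    v = torch.randn(s, b, h, d)

    print(q.shape == k.shape)                    # True:  shapes agree
    print(q.stride(), k.stride())                # strides do not
    print(q.is_contiguous(), k.is_contiguous())  # False, True

    # A stride-based check sees q as "bshd" and k/v as "sbhd"; that mixed
    # combination is not in the supported list, hence the exception above.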

1049451037 · Feb 26 '24 07:02

I solved the problem with a hard-coded workaround:

https://github.com/NVIDIA/Megatron-LM/issues/703#issuecomment-1965759788
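
The linked comment has the actual change. As an illustration of the general idea only, here is a hypothetical sketch (the helper name is made up; this is not the linked patch): force query/key/value into one contiguous layout before they reach core attention, at the cost of an extra copy per layer during generation.

    import torch

    def force_common_qkv_layout(query, key, value):
        # Copy the tensors into plain contiguous memory so a stride-based layout
        # check sees one supported layout (sbhd_sbhd_sbhd) instead of a mixed,
        # unsupported one.
        return query.contiguous(), key.contiguous(), value.contiguous()

    # The failing combination: query transposed in memory, key/value not.
    s, b, h, d = 8, 2, 4, 16
    query = torch.randn(b, s, h, d).transpose(0, 1)
    key = torch.randn(s, b, h, d)
    value = torch.randn(s, b, h, d)

    query, key, value = force_common_qkv_layout(query, key, value)
    assert query.is_contiguous() and key.is_contiguous() and value.is_contiguous()

This only hides the layout mismatch at the attention boundary; it does not change why the query ends up in a different memory layout in the first place.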

1049451037 · Feb 27 '24 05:02

I found that the cause is that transpose_output_memory was set to True.

leizhao1234 · Feb 27 '24 09:02

No. Changing transpose_output_memory to False causes training to raise an error.
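
Presumably that is because downstream code depends on which memory layout the rotary-embedding output has, so flipping the flag fixes one path and breaks another. A tiny PyTorch illustration (generic, not Megatron code) of an operation that only works for one of the two layouts:

    import torch

    s, b, h, d = 8, 2, 4, 16
    contiguous = torch.randn(s, b, h, d)                  # plain sbhd memory
    transposed = torch.randn(b, s, h, d).transpose(0, 1)  # same shape, bshd memory

    contiguous.view(s * b, h, d)                          # fine
    try:
        transposed.view(s * b, h, d)                      # incompatible strides
    except RuntimeError as err:
        print("view failed:", err)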

1049451037 · Feb 28 '24 07:02

Marking as stale. No activity in 60 days.

github-actions[bot] · Apr 28 '24 18:04

@1049451037 This is simply because you were using an m-core model (the m-core model has bugs). Switch to the legacy model with --use-legacy-model (m-core is now the default).

Marking as stale. No activity in 60 days.

github-actions[bot] · Sep 07 '24 18:09