
[QUESTION] UnboundLocalError: local variable 'output_tensor' referenced before assignment

Open zmtttt opened this issue 1 year ago • 4 comments

I'm pretraining Llama3-8B but hit a problem. (1) With the configuration vp 2, pp 8, 8 GPUs, I get this error:

deallocate_output_tensor(output_tensor, config.deallocate_pipeline_outputs)
UnboundLocalError: local variable 'output_tensor' referenced before assignment

(2) But when I change pp from 8 to 4, it works fine.

Why does this happen? Has anyone else hit the same problem?

zmtttt avatar Dec 05 '24 08:12 zmtttt
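For context, this UnboundLocalError is the standard Python failure mode where a variable is assigned only inside a loop or branch that may never execute. Below is a minimal sketch of that pattern using stand-in stubs; it is not Megatron's actual schedule code, only the shape of the bug.

```python
# Minimal sketch of the failure pattern with stand-in stubs; this is not
# Megatron's actual schedule code, only the shape of the bug.

def forward_step():
    # Stub standing in for one pipeline forward pass.
    return object()

def deallocate_output_tensor(tensor):
    # Stub standing in for Megatron's deallocation helper.
    pass

def run_stage(num_microbatches):
    for _ in range(num_microbatches):
        output_tensor = forward_step()  # bound only if the loop body runs

    # If num_microbatches == 0 the loop never executes, output_tensor is
    # never assigned, and this line raises:
    #   UnboundLocalError: local variable 'output_tensor' referenced before assignment
    deallocate_output_tensor(output_tensor)

run_stage(1)  # fine
run_stage(0)  # reproduces the UnboundLocalError
```

In Megatron's pipeline schedules, whether the assigning code path runs depends on the pp/vp/microbatch configuration, which would explain why changing pp from 8 to 4 makes the error disappear.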

I'm hitting the same problem, and it still hasn't been solved...

LitPrice avatar Jan 10 '25 02:01 LitPrice

Also encountering the same problem with BERT (32 layers, 32 GPUs, 16 PP stages, 2 layers per virtual pipeline stage).

bmehta001 avatar Jan 27 '25 23:01 bmehta001

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Mar 29 '25 18:03 github-actions[bot]

I bumped into a similar issue when I mistakenly set --num-layers-per-virtual-pipeline-stage larger than intended.

For example,

--num-layers=16
--pipeline-model-parallel-size=4
--num-layers-per-virtual-pipeline-stage=4

leads to virtual_pipeline_model_parallel_size=1, which doesn't seem to be an anticipated configuration.

Setting --num-layers-per-virtual-pipeline-stage to a reasonable value (2, in the case above) resolved the issue for me.

Ktakuya332C avatar May 21 '25 06:05 Ktakuya332C
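Note that the two configurations spelled out above both reduce to a virtual pipeline size of 1: 16 layers / 4 PP stages / 4 layers per virtual stage = 1, and 32 / 16 / 2 = 1 for the BERT case. A quick sanity check in that spirit is sketched below; check_virtual_pp is a hypothetical helper, not part of Megatron-LM, and it assumes the usual derivation virtual_pp_size = (num_layers // pp_size) // num_layers_per_virtual_pipeline_stage.

```python
# Sanity-check sketch (check_virtual_pp is a hypothetical helper, not a
# Megatron-LM API). Assumes the usual derivation:
#   virtual_pp_size = (num_layers // pp_size) // num_layers_per_virtual_pipeline_stage

def check_virtual_pp(num_layers, pp_size, layers_per_virtual_stage):
    if num_layers % pp_size != 0:
        raise ValueError("num_layers must divide evenly across pipeline stages")
    layers_per_stage = num_layers // pp_size
    if layers_per_stage % layers_per_virtual_stage != 0:
        raise ValueError("layers per stage must divide evenly into virtual stages")
    virtual_pp_size = layers_per_stage // layers_per_virtual_stage
    if virtual_pp_size < 2:
        # Interleaving degenerates to a plain pipeline; pick a smaller
        # --num-layers-per-virtual-pipeline-stage or drop the flag entirely.
        raise ValueError(f"virtual_pp_size={virtual_pp_size}, expected >= 2")
    return virtual_pp_size

print(check_virtual_pp(16, 4, 2))   # 2 -- the fix suggested above
# check_virtual_pp(16, 4, 4)        # would raise: virtual_pp_size = 1
# check_virtual_pp(32, 16, 2)       # would raise: the BERT config above, also 1
```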

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Jul 20 '25 18:07 github-actions[bot]

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar Jul 30 '25 02:07 github-actions[bot]