
Error on interactive generation

Open tonigi opened this issue 3 years ago • 10 comments

Describe the bug Setting "text-gen-type": "interactive" results in an IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [4], [3]. Other generation types work.
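
For context, the error comes from PyTorch advanced indexing: when two 1-D index tensors are used together, they must be broadcastable against each other. A minimal standalone sketch (shapes taken from the message above; all names are illustrative) reproduces the same failure:

    import torch

    logits_2d = torch.randn(4, 10)                       # 4 flattened positions
    arange_1d = torch.arange(4)                          # one row index per position
    masked_target_1d = torch.zeros(3, dtype=torch.long)  # but only 3 labels

    # IndexError: shape mismatch: indexing tensors could not be
    # broadcast together with shapes [4], [3]
    predicted = logits_2d[arange_1d, masked_target_1d]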

To Reproduce Steps to reproduce the behavior:

  1. Install, adapt 20B to the local environment, add the "text-gen-type": "interactive" config (see the config sketch after these steps)
  2. Run inference
  3. Enter arbitrary prompt when requested
  4. See error
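
For reference, here is a minimal text-generation config sketch in the gpt-neox YAML style. Only "text-gen-type" and "pipe-parallel-size" come from this report; the remaining keys follow the stock text_generation.yml and the values are illustrative:

    {
      # generation mode: "unconditional", "input-file", or "interactive"
      "text-gen-type": "interactive",
      "maximum_tokens": 256,
      "temperature": 0.9,
      "top_p": 0.0,
      "top_k": 0,
      # this report runs with pipeline parallelism enabled
      "pipe-parallel-size": 1,
    }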

Expected behavior Interactive generation should work the same as the non-interactive modes.

Environment (please complete the following information):

  • GPUs: 4xV100
  • Configs: 20B + "pipe-parallel-size": 1 + "text-gen-type": "interactive"

Additional context I'm on ppc64le, so some library versions differ from the pinned requirements. Please ignore this issue if it does not occur on more common platforms.

tonigi avatar Feb 15 '22 07:02 tonigi

Thank you for the bug report. Can you check that this wasn’t inadvertently caused by #539?

StellaAthena avatar Feb 16 '22 01:02 StellaAthena

Uhm no, I'm misunderstanding something. In both main and https://github.com/EleutherAI/gpt-neox/commit/2189a4f6724770d3087a3a19b75f25bfb73b9a06 , interactive seems to only work with 3-word prompts.

tonigi avatar Feb 16 '22 09:02 tonigi

> Uhm no, I'm misunderstanding something. In both main and 2189a4f, interactive seems to only work with 3-word prompts.

Ditto for me also.

Adrian-1234 avatar Feb 16 '22 16:02 Adrian-1234

Traceback (most recent call last):
  File "generate.py", line 88, in <module>
    main()
  File "generate.py", line 71, in main
    generate_samples_interactive(
  File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 745, in generate_samples_interactive
    for (
  File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 311, in stream_tokens
    logits = forward_model(model, model_inputs, neox_args.is_pipe_parallel)
  File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 155, in forward_model
    loss, logits = model.eval_batch(model_inputs, return_logits=True)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 394, in eval_batch
    self._exec_schedule(sched)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 1308, in _exec_schedule
    self._exec_instr(**cmd.kwargs)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 700, in _exec_forward_pass
    self.loss = self.loss_model(outputs, labels)
  File "/workspace/gpt-neox/megatron/model/gpt2_model.py", line 67, in cross_entropy
    losses = mpu.vocab_parallel_cross_entropy(output.float().contiguous(), labels)
  File "/workspace/gpt-neox/megatron/mpu/cross_entropy.py", line 114, in vocab_parallel_cross_entropy
    return _VocabParallelCrossEntropy.apply(vocab_parallel_logits, target)
  File "/workspace/gpt-neox/megatron/mpu/cross_entropy.py", line 60, in forward
    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [27], [3]

slash-under avatar Mar 23 '22 18:03 slash-under

Newbie alert! A quick test of input-file and interactive runs across 8 GPUs:

cross_entropy.py, line 60, with a debug print added just before the failing index:

    # print both index tensors right before the advanced-indexing line that fails
    print("DEBUG ", arange_1d, masked_target_1d)

    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]

For "text-gen-type": "input-file",

DEBUG tensor([0, 1, 2, 3, 4, 5, 6, 7, 8], device='cuda:6') tensor([ 58, 46434, 0], device='cuda:6')

DEBUG tensor([0, 1, 2, 3, 4, 5, 6, 7, 8], device='cuda:7') tensor([0, 0, 0], device='cuda:7')

For "text-gen-type": "interactive",

DEBUG DEBUG tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,

    18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,

    36, 37, 38], device='cuda:6') tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,

    18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,

    36, 37, 38], device='cuda:7') tensor([  513,   417,  1158,   368,   476,  1014,  3812,   253,  1386,   670,

     3347, 47301,    13,  2167,   253,  5301,   310,  4931,   247,  1652,

     2372, 16593,   984,   247,  3347,  3024,  3542,   407,   247,  5145,

      588,   320,  1805, 14109,   407,  1529,  5145,    15,     0],

   device='cuda:6')

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:7')

DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG tensor([0], device='cuda:7') DEBUG tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG tensor([0], device='cuda:7') DEBUG tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG DEBUG tensor([0], device='cuda:6') tensor([0], device='cuda:7') tensor([0], device='cuda:6')

I don't understand this code well enough, but could it be related to the large number of tensor elements in interactive mode? Perhaps the code isn't dimensioning the tensors correctly for the interactive input?
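
For anyone following along: in megatron/mpu/cross_entropy.py, arange_1d is sized from the flattened logits on each rank while masked_target_1d is sized from the labels, so the indexing fails whenever a rank's labels do not cover the same number of positions as its logits. A paraphrased sketch of the relevant lines (not verbatim):

    # flatten this rank's logits to (batch * seq_len, partition_vocab_size)
    logits_2d = vocab_parallel_logits.view(-1, partition_vocab_size)
    # flatten the labels -- this length comes from the labels tensor instead
    masked_target_1d = masked_target.view(-1)
    # one row index per flattened logit position
    arange_1d = torch.arange(0, logits_2d.size(0), device=logits_2d.device)
    # raises the shape-mismatch IndexError whenever the two lengths differ,
    # e.g. shapes [27] and [3] in the traceback above
    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]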

Adrian-1234 avatar Mar 27 '22 11:03 Adrian-1234

With that line added on my end...

Context prompt >>> this cat
DEBUG  tensor([0, 1], device='cuda:3') tensor([0, 0, 0], device='cuda:3')
DEBUG  tensor([0, 1], device='cuda:2') tensor([   58, 46434,     0], device='cuda:2')

The same prompt, passed in as a file:

DEBUG  tensor([0, 1], device='cuda:3') tensor([0, 0], device='cuda:3')
DEBUG  tensor([0, 1], device='cuda:2') tensor([5798,    0], device='cuda:2')

These tensors don't look properly dimensioned to me for smaller inputs either, so it looks like we have the root cause outlined.

slash-under avatar Mar 30 '22 15:03 slash-under

See #604 for the greater prototyping effort underway to ensure that all processes have the correct context_length and context_tokens.
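
For anyone hitting this before that lands: the direction of the fix is to have rank 0 tokenize the interactive prompt and broadcast both the padded tokens and the true prompt length to every other rank before the forward pass. A hypothetical sketch with plain torch.distributed (the actual patch goes through the megatron/mpu helpers; broadcast_prompt, pad_id, and max_seq_len are illustrative names):

    import torch
    import torch.distributed as dist

    def broadcast_prompt(context_tokens, max_seq_len, pad_id, device):
        # rank 0 owns the real prompt; other ranks allocate matching buffers
        if dist.get_rank() == 0:
            length = torch.tensor([len(context_tokens)], dtype=torch.long, device=device)
            tokens = torch.full((max_seq_len,), pad_id, dtype=torch.long, device=device)
            tokens[: len(context_tokens)] = torch.tensor(context_tokens, device=device)
        else:
            length = torch.zeros(1, dtype=torch.long, device=device)
            tokens = torch.full((max_seq_len,), pad_id, dtype=torch.long, device=device)
        # after these collectives every rank holds identical context tensors
        dist.broadcast(tokens, src=0)
        dist.broadcast(length, src=0)
        return tokens, int(length.item())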

slash-under avatar Apr 03 '22 18:04 slash-under

This is resolved.

slash-under avatar Jun 12 '22 16:06 slash-under

I'm observing behavior similar to this issue. Whenever I enter an interactive input longer than three tokens, I receive an error like this.

Context prompt >>> Are humans born with virtue?
Traceback (most recent call last):
  File "generate.py", line 88, in <module>
    main()
  File "generate.py", line 71, in main
    generate_samples_interactive(
  File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 746, in generate_samples_interactive
    for (
  File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 312, in stream_tokens
    logits = forward_model(model, model_inputs, neox_args.is_pipe_parallel)
  File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 155, in forward_model
    loss, logits = model.eval_batch(model_inputs, return_logits=True)
  File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 394, in eval_batch
    self._exec_schedule(sched)
  File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 1308, in _exec_schedule
    self._exec_instr(**cmd.kwargs)
  File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 700, in _exec_forward_pass
    self.loss = self.loss_model(outputs, labels)
  File "/home/mchorse/gpt-neox/megatron/model/gpt2_model.py", line 67, in cross_entropy
    losses = mpu.vocab_parallel_cross_entropy(output.float().contiguous(), labels)
  File "/home/mchorse/gpt-neox/megatron/mpu/cross_entropy.py", line 114, in vocab_parallel_cross_entropy
    return _VocabParallelCrossEntropy.apply(vocab_parallel_logits, target)
  File "/home/mchorse/gpt-neox/megatron/mpu/cross_entropy.py", line 60, in forward
    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [6], [3]

I don't receive any errors when the length of the input does not exceed three tokens.

Context prompt >>> Is virtue innate
Generated Text: ?
Generated Text: ? Can
Generated Text: ? Can '
Generated Text: ? Can 'nob
Generated Text: ? Can 'noble
Generated Text: ? Can 'noble sentiment
Generated Text: ? Can 'noble sentiment,'
Generated Text: ? Can 'noble sentiment,' '
Generated Text: ? Can 'noble sentiment,' 'just
Generated Text: ? Can 'noble sentiment,' 'just pride
Generated Text: ? Can 'noble sentiment,' 'just pride'
Generated Text: ? Can 'noble sentiment,' 'just pride' be
Generated Text: ? Can 'noble sentiment,' 'just pride' be fost
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in every
...
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in every human being by wholesome influences, moral laws, sound education, the workings of love? Miss Edgeworth says, no; that society is 'inert, boys mischievous, parents fanatical, children hopeless'; that much remains to be done, 'if admiration and emulation be to be our next goal in heaven; our saints and philosophers, our poets and warriors, are there, perhaps, only a preparation

Environment

  • 8x NVIDIA A40
  • Ubuntu 20.04

Kyle1668 avatar Jun 30 '22 16:06 Kyle1668

I think I'm also experiencing this. Any interactive prompt longer than three words has issues.

jdagdelen avatar Jul 16 '22 04:07 jdagdelen