
Error on interactive generation

Open tonigi opened this issue 3 years ago • 10 comments

Describe the bug Setting "text-gen-type": "interactive" results in an IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [4], [3]. Other generation types work.
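
For context, the error comes from PyTorch advanced indexing: when two 1-D index tensors are used together, they must be broadcastable against each other. A minimal standalone sketch (shapes taken from the message above; all names are illustrative) reproduces the same failure:

    import torch

    logits_2d = torch.randn(4, 10)                       # 4 flattened positions
    arange_1d = torch.arange(4)                          # one row index per position
    masked_target_1d = torch.zeros(3, dtype=torch.long)  # but only 3 labels

    # IndexError: shape mismatch: indexing tensors could not be
    # broadcast together with shapes [4], [3]
    predicted = logits_2d[arange_1d, masked_target_1d]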

To Reproduce Steps to reproduce the behavior:

  1. Install, adapt 20B to the local environment, add the "text-gen-type": "interactive" config (see the config sketch after these steps)
  2. Run inference
  3. Enter arbitrary prompt when requested
  4. See error
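
For reference, here is a minimal text-generation config sketch in the gpt-neox YAML style. Only "text-gen-type" and "pipe-parallel-size" come from this report; the remaining keys follow the stock text_generation.yml and the values are illustrative:

    {
      # generation mode: "unconditional", "input-file", or "interactive"
      "text-gen-type": "interactive",
      "maximum_tokens": 256,
      "temperature": 0.9,
      "top_p": 0.0,
      "top_k": 0,
      # this report runs with pipeline parallelism enabled
      "pipe-parallel-size": 1,
    }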

Expected behavior Interactive generation should work the same as the non-interactive modes.

Environment (please complete the following information):

  • GPUs: 4xV100
  • Configs: 20B + "pipe-parallel-size": 1 + "text-gen-type": "interactive"

Additional context I'm on ppc64le, so some library versions differ from the pinned requirements. Please ignore this issue if it does not occur on more common platforms.

tonigi avatar Feb 15 '22 07:02 tonigi

Thank you for the bug report. Can you check that this wasn’t inadvertently caused by #539?

StellaAthena avatar Feb 16 '22 01:02 StellaAthena

Uhm no, I'm misunderstanding something. In both main and https://github.com/EleutherAI/gpt-neox/commit/2189a4f6724770d3087a3a19b75f25bfb73b9a06 , interactive seems to only work with 3-word prompts.

tonigi avatar Feb 16 '22 09:02 tonigi

> Uhm no, I'm misunderstanding something. In both main and 2189a4f, interactive seems to only work with 3-word prompts.

Ditto for me also.

Adrian-1234 avatar Feb 16 '22 16:02 Adrian-1234

Traceback (most recent call last):
  File "generate.py", line 88, in <module>
    main()
  File "generate.py", line 71, in main
    generate_samples_interactive(
  File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 745, in generate_samples_interactive
    for (
  File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 311, in stream_tokens
    logits = forward_model(model, model_inputs, neox_args.is_pipe_parallel)
  File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 155, in forward_model
    loss, logits = model.eval_batch(model_inputs, return_logits=True)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 394, in eval_batch
    self._exec_schedule(sched)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 1308, in _exec_schedule
    self._exec_instr(**cmd.kwargs)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 700, in _exec_forward_pass
    self.loss = self.loss_model(outputs, labels)
  File "/workspace/gpt-neox/megatron/model/gpt2_model.py", line 67, in cross_entropy
    losses = mpu.vocab_parallel_cross_entropy(output.float().contiguous(), labels)
  File "/workspace/gpt-neox/megatron/mpu/cross_entropy.py", line 114, in vocab_parallel_cross_entropy
    return _VocabParallelCrossEntropy.apply(vocab_parallel_logits, target)
  File "/workspace/gpt-neox/megatron/mpu/cross_entropy.py", line 60, in forward
    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [27], [3]

slash-under avatar Mar 23 '22 18:03 slash-under

Newbie alert! A quick test of input-file and interactive runs across 8 GPUs:

cross_entropy.py, line 60, with a debug print added just before the failing index:

    # print both index tensors right before the advanced-indexing line that fails
    print("DEBUG ", arange_1d, masked_target_1d)

    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]

For "text-gen-type": "input-file",

DEBUG tensor([0, 1, 2, 3, 4, 5, 6, 7, 8], device='cuda:6') tensor([ 58, 46434, 0], device='cuda:6')

DEBUG tensor([0, 1, 2, 3, 4, 5, 6, 7, 8], device='cuda:7') tensor([0, 0, 0], device='cuda:7')

For "text-gen-type": "interactive",

DEBUG DEBUG tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,

    18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,

    36, 37, 38], device='cuda:6') tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,

    18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,

    36, 37, 38], device='cuda:7') tensor([  513,   417,  1158,   368,   476,  1014,  3812,   253,  1386,   670,

     3347, 47301,    13,  2167,   253,  5301,   310,  4931,   247,  1652,

     2372, 16593,   984,   247,  3347,  3024,  3542,   407,   247,  5145,

      588,   320,  1805, 14109,   407,  1529,  5145,    15,     0],

   device='cuda:6')

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:7')

DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG tensor([0], device='cuda:7') DEBUG tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG tensor([0], device='cuda:7') DEBUG tensor([0], device='cuda:7')

tensor([0], device='cuda:6') tensor([0], device='cuda:6')

DEBUG DEBUG tensor([0], device='cuda:6') tensor([0], device='cuda:7') tensor([0], device='cuda:6')

I don't understand this code well enough, but could it be related to the large number of tensor elements in interactive mode? Perhaps the code isn't dimensioning the tensors correctly for the interactive input?
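
For anyone following along: in megatron/mpu/cross_entropy.py, arange_1d is sized from the flattened logits on each rank while masked_target_1d is sized from the labels, so the indexing fails whenever a rank's labels do not cover the same number of positions as its logits. A paraphrased sketch of the relevant lines (not verbatim):

    # flatten this rank's logits to (batch * seq_len, partition_vocab_size)
    logits_2d = vocab_parallel_logits.view(-1, partition_vocab_size)
    # flatten the labels -- this length comes from the labels tensor instead
    masked_target_1d = masked_target.view(-1)
    # one row index per flattened logit position
    arange_1d = torch.arange(0, logits_2d.size(0), device=logits_2d.device)
    # raises the shape-mismatch IndexError whenever the two lengths differ,
    # e.g. shapes [27] and [3] in the traceback above
    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]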

Adrian-1234 avatar Mar 27 '22 11:03 Adrian-1234

With that line added on my end...

Context prompt >>> this cat
DEBUG  tensor([0, 1], device='cuda:3') tensor([0, 0, 0], device='cuda:3')
DEBUG  tensor([0, 1], device='cuda:2') tensor([   58, 46434,     0], device='cuda:2')

The same prompt, passed in as a file:

DEBUG  tensor([0, 1], device='cuda:3') tensor([0, 0], device='cuda:3')
DEBUG  tensor([0, 1], device='cuda:2') tensor([5798,    0], device='cuda:2')

These tensors don't look properly dimensioned to me for smaller inputs either, so it looks like we have the root cause outlined.

slash-under avatar Mar 30 '22 15:03 slash-under

See #604 for the greater prototyping effort underway to ensure that all processes have the correct context_length and context_tokens.
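
For anyone hitting this before that lands: the direction of the fix is to have rank 0 tokenize the interactive prompt and broadcast both the padded tokens and the true prompt length to every other rank before the forward pass. A hypothetical sketch with plain torch.distributed (the actual patch goes through the megatron/mpu helpers; broadcast_prompt, pad_id, and max_seq_len are illustrative names):

    import torch
    import torch.distributed as dist

    def broadcast_prompt(context_tokens, max_seq_len, pad_id, device):
        # rank 0 owns the real prompt; other ranks allocate matching buffers
        if dist.get_rank() == 0:
            length = torch.tensor([len(context_tokens)], dtype=torch.long, device=device)
            tokens = torch.full((max_seq_len,), pad_id, dtype=torch.long, device=device)
            tokens[: len(context_tokens)] = torch.tensor(context_tokens, device=device)
        else:
            length = torch.zeros(1, dtype=torch.long, device=device)
            tokens = torch.full((max_seq_len,), pad_id, dtype=torch.long, device=device)
        # after these collectives every rank holds identical context tensors
        dist.broadcast(tokens, src=0)
        dist.broadcast(length, src=0)
        return tokens, int(length.item())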

slash-under avatar Apr 03 '22 18:04 slash-under

This is resolved.

slash-under avatar Jun 12 '22 16:06 slash-under

I'm observing behavior similar to this issue. Whenever I enter an interactive input longer than three tokens, I receive an error like this.

Context prompt >>> Are humans born with virtue?
Traceback (most recent call last):
  File "generate.py", line 88, in <module>
    main()
  File "generate.py", line 71, in main
    generate_samples_interactive(
  File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 746, in generate_samples_interactive
    for (
  File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 312, in stream_tokens
    logits = forward_model(model, model_inputs, neox_args.is_pipe_parallel)
  File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 155, in forward_model
    loss, logits = model.eval_batch(model_inputs, return_logits=True)
  File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 394, in eval_batch
    self._exec_schedule(sched)
  File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 1308, in _exec_schedule
    self._exec_instr(**cmd.kwargs)
  File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 700, in _exec_forward_pass
    self.loss = self.loss_model(outputs, labels)
  File "/home/mchorse/gpt-neox/megatron/model/gpt2_model.py", line 67, in cross_entropy
    losses = mpu.vocab_parallel_cross_entropy(output.float().contiguous(), labels)
  File "/home/mchorse/gpt-neox/megatron/mpu/cross_entropy.py", line 114, in vocab_parallel_cross_entropy
    return _VocabParallelCrossEntropy.apply(vocab_parallel_logits, target)
  File "/home/mchorse/gpt-neox/megatron/mpu/cross_entropy.py", line 60, in forward
    predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [6], [3]

I don't receive any errors when the length of the input does not exceed three tokens.

Context prompt >>> Is virtue innate
Generated Text: ?
Generated Text: ? Can
Generated Text: ? Can '
Generated Text: ? Can 'nob
Generated Text: ? Can 'noble
Generated Text: ? Can 'noble sentiment
Generated Text: ? Can 'noble sentiment,'
Generated Text: ? Can 'noble sentiment,' '
Generated Text: ? Can 'noble sentiment,' 'just
Generated Text: ? Can 'noble sentiment,' 'just pride
Generated Text: ? Can 'noble sentiment,' 'just pride'
Generated Text: ? Can 'noble sentiment,' 'just pride' be
Generated Text: ? Can 'noble sentiment,' 'just pride' be fost
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in every
...
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in every human being by wholesome influences, moral laws, sound education, the workings of love? Miss Edgeworth says, no; that society is 'inert, boys mischievous, parents fanatical, children hopeless'; that much remains to be done, 'if admiration and emulation be to be our next goal in heaven; our saints and philosophers, our poets and warriors, are there, perhaps, only a preparation

Environment

  • 8x NVIDIA A40
  • Ubuntu 20.04

Kyle1668 avatar Jun 30 '22 16:06 Kyle1668

I think I'm also experiencing this. Any interactive prompt longer than three words has issues.

jdagdelen avatar Jul 16 '22 04:07 jdagdelen