gpt-neox
Error on interactive generation
Describe the bug
Setting "text-gen-type": "interactive" results in an IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [4], [3]. Other generation types work.
To Reproduce
Steps to reproduce the behavior:
- Install, adapt 20B to local environment, add "text-gen-type": "interactive" config
- Run inference
- Enter arbitrary prompt when requested
- See error
Expected behavior
Should work like non-interactive mode.
Environment (please complete the following information):
- GPUs: 4xV100
- Configs: 20B + "pipe-parallel-size": 1 + "text-gen-type": "interactive"
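For reference, a minimal sketch of the generation-related config fragment in the gpt-neox style; the sampling values below are illustrative placeholders, not the exact settings used:

```yaml
# Sketch of a text-generation config fragment (sampling values illustrative)
{
  "pipe-parallel-size": 1,
  "text-gen-type": "interactive",
  "maximum_tokens": 256,
  "temperature": 0.9,
  "top_p": 0,
  "top_k": 0
}
```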
Additional context
Using ppc64le, so some libraries are not exactly at the pinned versions. Please disregard this issue if it does not occur on more common platforms.
Thank you for the bug report. Can you check that this wasn’t inadvertently caused by #539?
Hmm no, I must be misunderstanding something. In both main and https://github.com/EleutherAI/gpt-neox/commit/2189a4f6724770d3087a3a19b75f25bfb73b9a06, interactive mode seems to work only with 3-word prompts.
Same here.
Traceback (most recent call last):
File "generate.py", line 88, in <module>
main()
File "generate.py", line 71, in main
generate_samples_interactive(
File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 745, in generate_samples_interactive
for (
File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 311, in stream_tokens
logits = forward_model(model, model_inputs, neox_args.is_pipe_parallel)
File "/workspace/gpt-neox/megatron/text_generation_utils.py", line 155, in forward_model
loss, logits = model.eval_batch(model_inputs, return_logits=True)
File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 394, in eval_batch
self._exec_schedule(sched)
File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 1308, in _exec_schedule
self._exec_instr(**cmd.kwargs)
File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 700, in _exec_forward_pass
self.loss = self.loss_model(outputs, labels)
File "/workspace/gpt-neox/megatron/model/gpt2_model.py", line 67, in cross_entropy
losses = mpu.vocab_parallel_cross_entropy(output.float().contiguous(), labels)
File "/workspace/gpt-neox/megatron/mpu/cross_entropy.py", line 114, in vocab_parallel_cross_entropy
return _VocabParallelCrossEntropy.apply(vocab_parallel_logits, target)
File "/workspace/gpt-neox/megatron/mpu/cross_entropy.py", line 60, in forward
predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [27], [3]
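For context on the failing line: cross_entropy.py line 60 gathers each token position's predicted logit by pairing a position index (arange_1d) with a target vocab index (masked_target_1d); advanced indexing requires the two index tensors to broadcast to a common shape. A toy NumPy sketch of that gather (shapes and values are illustrative, not the model's real dimensions):

```python
import numpy as np

# Flattened logits: one row per token position, one column per vocab entry.
logits_2d = np.arange(12, dtype=float).reshape(4, 3)  # 4 positions, vocab of 3

arange_1d = np.arange(4)                    # position indices [0, 1, 2, 3]
masked_target_1d = np.array([2, 0, 1, 2])   # one target id per position

# Advanced indexing pairs the two index arrays element-wise, so they
# must have matching (broadcastable) lengths -- the invariant that is
# violated in the traceback above.
predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
print(predicted_logits_1d)  # -> [ 2.  3.  7. 11.]
```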
Newbie alert! A quick test of input-file and interactive runs across 8 GPUs:
cross_entropy.py, line 60, with a debug print added:
print("DEBUG", arange_1d, masked_target_1d)
predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
For "text-gen-type": "input-file",
DEBUG tensor([0, 1, 2, 3, 4, 5, 6, 7, 8], device='cuda:6') tensor([ 58, 46434, 0], device='cuda:6')
DEBUG tensor([0, 1, 2, 3, 4, 5, 6, 7, 8], device='cuda:7') tensor([0, 0, 0], device='cuda:7')
For "text-gen-type": "interactive",
DEBUG DEBUG tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38], device='cuda:6') tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38], device='cuda:7') tensor([ 513, 417, 1158, 368, 476, 1014, 3812, 253, 1386, 670,
3347, 47301, 13, 2167, 253, 5301, 310, 4931, 247, 1652,
2372, 16593, 984, 247, 3347, 3024, 3542, 407, 247, 5145,
588, 320, 1805, 14109, 407, 1529, 5145, 15, 0],
device='cuda:6')
tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:7')
DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')
tensor([0], device='cuda:6') tensor([0], device='cuda:6')
DEBUG tensor([0], device='cuda:7') DEBUG tensor([0], device='cuda:7')
tensor([0], device='cuda:6') tensor([0], device='cuda:6')
DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')
tensor([0], device='cuda:6') tensor([0], device='cuda:6')
DEBUG DEBUG tensor([0], device='cuda:7') tensor([0], device='cuda:7')
tensor([0], device='cuda:6') tensor([0], device='cuda:6')
DEBUG tensor([0], device='cuda:7') DEBUG tensor([0], device='cuda:7')
tensor([0], device='cuda:6') tensor([0], device='cuda:6')
DEBUG DEBUG tensor([0], device='cuda:6') tensor([0], device='cuda:7') tensor([0], device='cuda:6')
I don't understand enough about how this code works, but could it be related to the large number of tensor elements in interactive mode? Perhaps the code isn't dimensioning the target tensor correctly for the interactive input?
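That matches the debug output above: the position index has one entry per token position (39 in the interactive run), while the target has only 3 entries, and NumPy/PyTorch advanced indexing refuses to pair index arrays of mismatched lengths. A quick NumPy illustration of the rule (toy shapes; PyTorch raises the analogous IndexError seen in the tracebacks):

```python
import numpy as np

logits_2d = np.zeros((39, 5))           # 39 token positions, toy vocab of 5

arange_1d = np.arange(39)               # 39 position indices
masked_target_1d = np.array([1, 2, 3])  # only 3 targets, as in interactive mode

try:
    logits_2d[arange_1d, masked_target_1d]
except IndexError as e:
    # e.g. "shape mismatch: indexing arrays could not be broadcast together ..."
    error_message = str(e)

# With matching lengths, the same indexing succeeds:
ok = logits_2d[arange_1d, np.zeros(39, dtype=int)]
```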
With that line added on my end...
Context prompt >>> this cat
DEBUG tensor([0, 1], device='cuda:3') tensor([0, 0, 0], device='cuda:3')
DEBUG tensor([0, 1], device='cuda:2') tensor([ 58, 46434, 0], device='cuda:2')
The same prompt, passed in as a file:
DEBUG tensor([0, 1], device='cuda:3') tensor([0, 0], device='cuda:3')
DEBUG tensor([0, 1], device='cuda:2') tensor([5798, 0], device='cuda:2')
These tensors don't look properly dimensioned for smaller inputs either; it looks like we have the root cause outlined.
See #604 for the greater prototyping effort underway to ensure that all processes have the correct context_length and context_tokens.
This is resolved.
I'm observing behavior similar to this issue. Whenever I enter an interactive prompt longer than three tokens, I receive an error like this:
Context prompt >>> Are humans born with virtue?
Traceback (most recent call last):
File "generate.py", line 88, in <module>
main()
File "generate.py", line 71, in main
generate_samples_interactive(
File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 746, in generate_samples_interactive
for (
File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 312, in stream_tokens
logits = forward_model(model, model_inputs, neox_args.is_pipe_parallel)
File "/home/mchorse/gpt-neox/megatron/text_generation_utils.py", line 155, in forward_model
loss, logits = model.eval_batch(model_inputs, return_logits=True)
File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 394, in eval_batch
self._exec_schedule(sched)
File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 1308, in _exec_schedule
self._exec_instr(**cmd.kwargs)
File "/home/mchorse/anaconda3/envs/kyle1668/lib/python3.8/site-packages/deepspeed/runtime/pipe/engine.py", line 700, in _exec_forward_pass
self.loss = self.loss_model(outputs, labels)
File "/home/mchorse/gpt-neox/megatron/model/gpt2_model.py", line 67, in cross_entropy
losses = mpu.vocab_parallel_cross_entropy(output.float().contiguous(), labels)
File "/home/mchorse/gpt-neox/megatron/mpu/cross_entropy.py", line 114, in vocab_parallel_cross_entropy
return _VocabParallelCrossEntropy.apply(vocab_parallel_logits, target)
File "/home/mchorse/gpt-neox/megatron/mpu/cross_entropy.py", line 60, in forward
predicted_logits_1d = logits_2d[arange_1d, masked_target_1d]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [6], [3]
I don't receive any errors when the input does not exceed three tokens:
Context prompt >>> Is virtue innate
Generated Text: ?
Generated Text: ? Can
Generated Text: ? Can '
Generated Text: ? Can 'nob
Generated Text: ? Can 'noble
Generated Text: ? Can 'noble sentiment
Generated Text: ? Can 'noble sentiment,'
Generated Text: ? Can 'noble sentiment,' '
Generated Text: ? Can 'noble sentiment,' 'just
Generated Text: ? Can 'noble sentiment,' 'just pride
Generated Text: ? Can 'noble sentiment,' 'just pride'
Generated Text: ? Can 'noble sentiment,' 'just pride' be
Generated Text: ? Can 'noble sentiment,' 'just pride' be fost
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in every
...
Generated Text: ? Can 'noble sentiment,' 'just pride' be fostered in every human being by wholesome influences, moral laws, sound education, the workings of love? Miss Edgeworth says, no; that society is 'inert, boys mischievous, parents fanatical, children hopeless'; that much remains to be done, 'if admiration and emulation be to be our next goal in heaven; our saints and philosophers, our poets and warriors, are there, perhaps, only a preparation
Environment
- 8x NVIDIA A40
- Ubuntu 20.04
I think I'm also experiencing this. Any interactive prompt longer than 3 words produces this error.