gpt-fast
Overly long input texts cause a device-side assert to be triggered
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [58,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [59,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [60,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [61,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [62,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
<frozen importlib._bootstrap_external>:843: _call_with_frames_removed: block: [8,0,0], thread: [63,0,0] Assertion `index out of bounds: 0 <= tmp68 < 1504` failed.
...
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
When I use llama-2-7b-hf with ten input samples, each around 2048 tokens long after tokenization, I still encounter the above error, even though I have set the block size to 4096.
If I shorten the inputs, it works.
Any idea how to use this for a long-context example? It seems that max_seq_len > 2048 triggers the error above.
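As a stopgap while debugging the root cause, one workaround is to clip each tokenized prompt so that the prompt plus the tokens to be generated fit inside the sequence length the KV cache was allocated for. This is only a sketch: the 2048 limit is an assumption based on the lengths that fail above, and `clip_prompts` is a hypothetical helper, not part of gpt-fast; dummy integer lists stand in for real tokenizer output.

```python
def clip_prompts(prompts, max_seq_len=2048, reserve_new_tokens=256):
    """Truncate each tokenized prompt so that prompt length plus
    reserve_new_tokens stays within the assumed max_seq_len."""
    budget = max_seq_len - reserve_new_tokens
    # Keep the *end* of each prompt, which usually carries the actual question.
    return [p[-budget:] if len(p) > budget else p for p in prompts]

# Dummy token ids standing in for tokenizer output.
prompts = [list(range(3000)), list(range(100))]
clipped = clip_prompts(prompts)
print([len(p) for p in clipped])  # → [1792, 100]
```

This avoids the out-of-bounds index at the cost of dropping the start of long prompts; it does not explain why setting the block size to 4096 fails to lift the limit.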