xgrammar
Fix warp_size in triton kernel for AMD GPUs
This fix resolves the triton.runtime.errors.OutOfResources error on AMD GPUs (MI300).
Here's the error log without this fix:
File "/Projects/VLLM_DIR/vllm/vllm/v1/worker/gpu_model_runner.py", line 2934, in sample_tokens
apply_grammar_bitmask(
File "/Projects/VLLM_DIR/vllm/vllm/v1/structured_output/utils.py", line 126, in apply_grammar_bitmask
xgr.apply_token_bitmask_inplace(logits, grammar_bitmask, indices=index_tensor)
File "/usr/local/lib/python3.12/dist-packages/xgrammar/matcher.py", line 147, in apply_token_bitmask_inplace
apply_token_bitmask_inplace_triton(logits, bitmask, vocab_size, indices)
File "/usr/local/lib/python3.12/dist-packages/xgrammar/kernels/apply_token_bitmask_inplace_triton.py", line 106, in apply_token_bitmask_inplace_triton
apply_token_bitmask_inplace_kernel[grid](
File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 393, in <lambda>
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 623, in run
kernel.run(grid_0, grid_1, grid_2, stream, kernel.function, kernel.packed_metadata, launch_metadata,
^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 467, in __getattribute__
self._init_handles()
File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 461, in _init_handles
raise OutOfResources(self.metadata.num_warps * warp_size, self.n_max_threads, "threads")
triton.runtime.errors.OutOfResources: out of resource: threads, Required: 2048, Hardware limit: 1024. Reducing block sizes or `num_stages` may help.
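The last two frames of the log show that Triton checks num_warps * warp_size against the hardware thread limit. The sketch below reproduces that arithmetic in plain Python. It is an illustration, not the diff from this PR: the helper names are hypothetical, the warp sizes (32 on NVIDIA, 64 on MI300) are assumptions based on typical hardware, and num_warps = 32 is inferred from the "Required: 2048" figure rather than read from the kernel source.

```python
# Illustrative only: these helpers are hypothetical and are not part of
# xgrammar or Triton. They mirror the check in the traceback:
#   required threads = num_warps * warp_size

MAX_THREADS_PER_BLOCK = 1024  # "Hardware limit: 1024" from the error log


def required_threads(num_warps: int, warp_size: int) -> int:
    """Threads a launch needs for a given warp count and warp size."""
    return num_warps * warp_size


def clamp_num_warps(num_warps: int, warp_size: int,
                    max_threads: int = MAX_THREADS_PER_BLOCK) -> int:
    """Largest warp count (at most num_warps) that fits in max_threads."""
    return max(1, min(num_warps, max_threads // warp_size))


# Assumed warp sizes: 32 (NVIDIA) and 64 (AMD MI300 wavefront).
nvidia_threads = required_threads(num_warps=32, warp_size=32)  # fits in 1024
amd_threads = required_threads(num_warps=32, warp_size=64)     # exceeds 1024
amd_safe_warps = clamp_num_warps(num_warps=32, warp_size=64)
```

Under these assumptions, a launch configuration sized for a 32-wide warp doubles its thread demand on a 64-wide wavefront, which matches the 2048 versus 1024 mismatch reported in the log.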
@Ubospica @mgorny Looking for a review.
@Seven-Streams @southfreebird Looking for a review
It looks good to me. Supporting AMD GPUs is very meaningful. Thanks, and I am supportive of merging it!
Hi @Ubospica, I see all of the checks have passed. Could this get merged now? Thanks!
@divakar-amd @micah-wil I have merged it. Thanks for the contribution!