Ibrahim Ahmed
## Fix streaming requests hanging when structured output FSM fails to advance

### Problem

When using structured outputs with the xgrammar backend, streaming requests would hang indefinitely if the FSM...
### Your current environment

```
INFO 05-27 16:20:01 [__init__.py:239] Automatically detected platform cuda.
Collecting environment information...
PyTorch version: 2.6.0+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM...
```
### Describe the bug

When downloading a model, the `file_download.py` file does not throw an error when there is not enough space.

https://github.com/huggingface/huggingface_hub/blob/2702ec2a2bd0124cc1fddfd72ccb1297b2478148/src/huggingface_hub/file_download.py#L651

This is problematic in environments like sglang,...
It's overly restrictive to force which keys are modifiers and which aren't. A user should be able to define `Alt` as a single key for a hotkey.