nullpointer0xffff
+1, it doesn't seem GPU-related; I tested on both A100 and V100 GPUs and saw the same issue. Using line_profiler, I found that this [get_guided_decoding_logits_processor](https://github.com/vllm-project/vllm/blob/63575bc2e197b85ce1c911421ff30c5459e35e9c/vllm/entrypoints/openai/serving_completion.py#L96-L98) call takes 93% of the time.
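For reference, a minimal line_profiler sketch of the kind of measurement described above. The import path for `get_guided_decoding_logits_processor` is assumed from the linked commit and may differ in other vLLM versions, and the call site is only indicative:

```python
# Hedged sketch: wrap the guided-decoding setup call with line_profiler and
# dump per-line timings (Hits / Time / % Time) after a request or two.
from line_profiler import LineProfiler

# Assumed import path (matches the linked commit; may live elsewhere in newer vLLM).
from vllm.model_executor.guided_decoding import (
    get_guided_decoding_logits_processor,
)

lp = LineProfiler()
# LineProfiler works as a decorator: the wrapped callable records per-line stats
# every time it runs.
profiled = lp(get_guided_decoding_logits_processor)

# ... substitute `profiled` for the original call in serving_completion.py
# (it is awaited there), send a request with guided_json / guided_regex, then:
lp.print_stats()
```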
> If testing lm-format-enforcer, I highly recommend adding the latest version of it to the image, as there have been performance improvements to the JsonSchemaParser. The next version of vLLM...
@noamgat here's a profiling run when I use lm-format-enforcer 0.10.1.

```
/lib/python3.10/site-packages/lmformatenforcer/integrations/transformers.py
Function: _build_regular_tokens_list at line 58

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    58                                           @profile
    59...
```
+1 to supporting `logit_bias` so that libraries like guidance can use it. Though there's a workaround: run the vLLM API server to mock the ChatGPT API and point guidance's OpenAI client to...
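For context, the first half of that workaround looks roughly like this: vLLM already exposes an OpenAI-compatible endpoint, so any OpenAI-style client (the plain `openai` package here; guidance's OpenAI client would be configured analogously) can be pointed at it. Host, port, and model name below are placeholders.

```python
# Hedged sketch of the workaround: treat a running vLLM OpenAI-compatible
# server as if it were the ChatGPT API. Requires `pip install openai` (v1+)
# and a vLLM server started with its OpenAI entrypoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.completions.create(
    model="meta-llama/Llama-2-7b-hf",  # placeholder: whatever model vLLM serves
    prompt="The quick brown fox",
    max_tokens=16,
)
print(resp.choices[0].text)
```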