Stas Bekman
I agree. Following your suggestion, I wrote a hack that enforces valid JSON at the cost of an abrupt ending. Surely the situation is extreme because I'm using an extremely...
As [Mihai Balint](https://x.com/mbalint) mentioned on Twitter, `json_repair` could be another approach, as it overcomes the missing structure closure:
```
In [1]: import json_repair
In [2]: json_repair.loads('{"a":1,"b":"str"')
Out[2]: {'a': 1, 'b':...
```
So we ended up using `json_repair` to solve this issue.
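To make "missing structure closure" concrete, here is a minimal stdlib-only sketch of the idea. It is not how `json_repair` is implemented; `close_truncated_json` and `loads_lenient` are hypothetical helpers written for illustration, and they only cover output that was cut off inside a string or before the closing brackets.

```python
import json


def close_truncated_json(text: str) -> str:
    """Append the quote/brackets needed to close JSON that was cut off.

    Hypothetical helper for illustration only. It does not handle input
    that stops right after a key, a colon, a comma, or a backslash.
    """
    closers = []        # closing characters still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()
    suffix = '"' if in_string else ""
    return text + suffix + "".join(reversed(closers))


def loads_lenient(text: str):
    """Parse strictly first; fall back to closing a truncated payload."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return json.loads(close_truncated_json(text))
```

Calling `loads_lenient('{"a":1,"b":"str"')` goes through the fallback path, since the strict parse fails on the missing `}`. A real repair library covers many more cases than this sketch does.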
I think other methods need caching as well - perhaps in another PR if it is not a regression? See the first item of https://github.com/vllm-project/vllm/issues/8313#issuecomment-2342297415 - it rebuilds the json schema...
I second that: with vllm the cache is being ignored and the FSM is recompiled on every use. I manually applied your patch and validated that it works. Shaves about 3 secs off...
That's a great idea: run the request twice, capture the log, and check that the "Compiling FSM index for all state transitions" message appears only once.
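A rough sketch of such a check, assuming the message reaches stderr (how `outlines` actually emits it may differ, and the capture mechanism would need to match). `fake_compile_fsm` is a hypothetical cached stand-in for the real compilation step, and `count_compilations` is an illustrative helper, not part of vllm or `outlines`.

```python
import contextlib
import functools
import io
import sys

FSM_MESSAGE = "Compiling FSM index for all state transitions"


@functools.lru_cache(maxsize=None)
def fake_compile_fsm(schema: str) -> int:
    """Hypothetical stand-in for the expensive FSM build.

    Prints the message only when the body actually runs, i.e. on a cache miss.
    """
    print(FSM_MESSAGE, file=sys.stderr)
    return len(schema)


def count_compilations(request, repeats: int = 2) -> int:
    """Run `request` `repeats` times and count how often the message appears."""
    buffer = io.StringIO()
    with contextlib.redirect_stderr(buffer):
        for _ in range(repeats):
            request()
    return buffer.getvalue().count(FSM_MESSAGE)
```

With a working cache, `count_compilations(lambda: fake_compile_fsm(schema))` should report a single compilation for two identical requests; a count equal to `repeats` would indicate the cache is being bypassed.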
> Seems like it's prone to failure if the message ever changes. "Compiling FSM index for all state transitions" is printed by `outlines` so there should be no problem whatsoever...
As the OP points out, your documentation says that v3 is supported - so you probably need to change it to be more specific - i.e. only v3.0 is supported and...
And I also don't understand why you closed this. > Close since no recent update, please feel free to reopen this issue if needed. Update from whom? You are...
Thank you for the follow-up, @laikhtewari. I see the confusion now - I think I tested with both 3.1 and 3.2 and both were failing, but the issue I...