Gintas Z.
I think this is a real problem. @hnyls2002 have you tried testing generation with a batch size of 100 or 1000, and multi-step structured generation with a connection to a remote endpoint?...
@m0g1cian I solved this with the retry logic in https://github.com/sgl-project/sglang/pull/424
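To illustrate: the linked PR's exact code isn't reproduced here, but retry logic for a flaky remote endpoint generally looks like the sketch below. The function name, parameters, and defaults are illustrative, not taken from the PR.

```python
import time


def with_retries(fn, max_attempts=3, base_delay=0.1):
    """Call fn(), retrying on exception with exponential backoff.

    Illustrative sketch only; names and defaults are assumptions,
    not the actual code from sglang PR #424.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # back off 0.1s, 0.2s, 0.4s, ... before retrying
            time.sleep(base_delay * (2 ** attempt))


# usage: wrap the request that sometimes fails transiently
calls = {"n": 0}

def flaky_request():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky_request)
```

The key design choice is exponential backoff: transient server overload usually clears faster than the retries escalate, so later attempts wait longer instead of hammering the endpoint.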
I'd request to include support for Phi-3-mini
I observe this with `meta-llama/Meta-Llama-3-8B-Instruct`. I think it's a very critical issue.
> Also experiencing this with `meta-llama/Meta-Llama-3-8B-Instruct`, this makes the library more or less unusable for me. Which is a shame because I love sglang. I've reverted back to 0.1.14 and...
@m0g1cian I've observed that with some regex formulations, current versions get stuck in infinite generation long beyond the max token length; see my other issue: https://github.com/sgl-project/sglang/issues/414
> @Gintasz As of now, I just set the `max_tokens` parameter as a safeguard in every `sgl.gen()` to avoid such infinite generation issues.

Yes, but with some regex formulations, this `max_tokens`...
@KMouratidis can you write out the full list of commands you used to pull the weights of the Mixtral model and pass them to sglang?
@KMouratidis yeah, and what was your `$MODEL` value? Because when I tried it like the command below, I got the model path name error from the original post: ``` python3 -m sglang.launch_server --model-path...
I'd totally want this as a raycast plugin.