Quang-elec44

Results 17 comments of Quang-elec44
@tianleiwu Thank you for the information. I managed to change my code so that it no longer adds new inputs, and I successfully exported my model. However, there are lots of...

@jiguanglizipao I agree with you; it seems that the `best_of` argument does not give good results. Moreover, in the case of my model, using `do_sample` leads to unwanted results.

I got the same problem when running with a batch size of 64. The server crashed after running for a few minutes (it failed with every backend). There is no problem when running with `vllm:0.6.2`....

It seems that `haystack` does not support parallel execution. I spent time reading the documentation, but there is currently no solution. By the way, @alex-stoica, could you tell me how to visualize...

@alex-stoica Yeah, I read the tutorial but didn't find it useful. I think Haystack lacks dynamic/parallel graph execution, so the team needs to work more on this. Currently, I switch...

@stas00 In my experience, guided generation is always slower than unguided generation. I recommend you try `sglang` instead. SGLang achieves better throughput than vLLM, but its guided generation is still slower.

> > @stas00 In my experience, guided generation is always slower than normal. I recommend you try `sglang` instead. Sglang achieves better throughput than vLLM, but the guided generation...