Pierre Janeke

Results: 10 comments by Pierre Janeke

I see that 2. was fixed by https://github.com/sgl-project/sglang/commit/b0890631a011be28d5ef5a0b4d5551fdeb94ab25

Does this mean the problem with 1. is fixed, @merrymercy?

@rlouf did you manage to make much progress yet?

I had a similar problem running on an EC2 g5.2xlarge instance (1 x A10G) using openchat/openchat3.5-0106. I have long sequences (6-7k tokens). A batch size of 19 sequences is fine,...

@hnyls2002 is it possible to launch 8 servers (one for each GPU) on a single machine with 8 GPUs?

I know this results in a full copy of the model being on each GPU, but that is ideal for my use case. Apparently, you can do it with vllm...
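
For illustration, here is a rough sketch of the one-server-per-GPU setup asked about above. It assumes the server is started with `python -m sglang.launch_server --model-path ... --port ...` and that each process is pinned to a single device through the `CUDA_VISIBLE_DEVICES` environment variable. The helper names `build_launch_plan` and `launch_all`, the base port, and the model placeholder are hypothetical and not from the thread; whether this is the recommended way to do it is not confirmed here.

```python
import os
import subprocess
import sys


def build_launch_plan(model_path, num_gpus=8, base_port=30000):
    """Return one (command, env) pair per GPU. Nothing is launched here.

    Hypothetical helper: each server sees exactly one GPU via
    CUDA_VISIBLE_DEVICES and listens on its own port (base_port + gpu index).
    """
    plan = []
    for gpu in range(num_gpus):
        env = dict(os.environ)
        env["CUDA_VISIBLE_DEVICES"] = str(gpu)  # pin this process to one device
        cmd = [
            sys.executable, "-m", "sglang.launch_server",
            "--model-path", model_path,
            "--port", str(base_port + gpu),
        ]
        plan.append((cmd, env))
    return plan


def launch_all(plan):
    """Start every server in the plan and return the process handles."""
    return [subprocess.Popen(cmd, env=env) for cmd, env in plan]


# Example (not executed here; "your-org/your-model" is a placeholder):
# procs = launch_all(build_launch_plan("your-org/your-model"))
# for p in procs:
#     p.wait()
```

Each process loads its own full copy of the model, so requests would need to be spread across the eight ports by the client or by a load balancer in front of them.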

Is this happening soon?

@MightyGoldenJA I think you can use the outlines integration in vllm and pass it as an argument to the vllm integration in langchain (I hope I used the right phrasing)....

I am not very familiar with these libraries, but how about what [aiopath](https://github.com/alexdelorenzo/aiopath) and [aiobotocore](https://github.com/aio-libs/aiobotocore) did? Perhaps they could be a source of inspiration if someone is willing to put...