Matthias Reso
Hi @mhashas sorry for not getting to this sooner. Could you please give a bit more detail on how you imagine that integration? We're using pydantic dataclasses for our vllm...
Hi @james-joobs thanks for reporting your issue. For a bit more context, could you add a complete log showing your error? What platform are you on? Any information on your...
Thanks @james-joobs for the additional information. Now it's clearer to me where the issue is. The BaseHandler or a derived class is not executed directly from the CLI. It's...
Hi @richardkmichael thanks for the contribution, I'll go through the detailed changes tonight.
Hi @DerrickYLJ in your torchrun call you need to set --nproc_per_node to your number of GPUs. It will spin up a process for each GPU to split the model.
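For illustration, a hedged sketch of such a launch command (the script name, checkpoint directory, and GPU count below are placeholders, not taken from your setup):

```shell
# Hypothetical example: node with 4 GPUs, so torchrun spawns 4 processes,
# one per GPU, and the model is split across them.
torchrun --nproc_per_node 4 example_chat_completion.py \
    --ckpt_dir path/to/model/ \
    --tokenizer_path path/to/model/tokenizer.model
```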
Sorry @ISADORAyt, I wasn't paying attention that @DerrickYLJ was loading the 8B model. The code in this repo is only able to load the 8B on a single GPU and the...
> I think that the problem is due to Llama3-8B-Instruct only has one checkpoint file? So how does set nproc_per_node will help, or more specifically, how can we solve this?...
Hi @vandesa003 on the client (l6/locust) side, how many concurrent users/connections do you allow? It looks a bit like you're not providing enough requests to the server and the GPUs...
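To illustrate the point (a minimal sketch, not the locust API): the server batches incoming requests, so the client must keep enough requests in flight to saturate the GPUs. The helper name, `send_request` callable, and user counts below are all placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def run_concurrent(send_request, num_users=32, requests_per_user=4):
    """Fan out num_users * requests_per_user calls with num_users in flight.

    send_request is any zero-argument callable that performs one inference
    request (e.g. an HTTP POST to the serving endpoint); a stand-in here.
    Returns the list of all responses.
    """
    total = num_users * requests_per_user
    # max_workers caps how many requests are in flight at once; if this is
    # smaller than the server's batch size, the GPUs will sit partly idle.
    with ThreadPoolExecutor(max_workers=num_users) as pool:
        return list(pool.map(lambda _: send_request(), range(total)))
```

With locust the equivalent knob is simply the number of simulated users; the sketch just shows why raising client concurrency matters.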
Hi @mylesgoose I think that could be a great idea. Can you share a bit about how the interface would look after this integration?
Great! Could you package the checkpointing pieces into a PR? Happy to review this.