Aaron Pham
This probably has to do with the base image also including vLLM and all of its dependencies.
Hi there, the vllm backend is not yet supported with adapters.
How many GPUs do you have?
Can you also send the whole stack trace from the server?
So did the model start up correctly? I was able to run llama-2-7b on a single T4.
These are subsequent requests, right?
Yes, this is currently a bug that has also been reported elsewhere; I'm taking a look at the moment.
By the way, you can change `max_new_tokens` per request; I'm going to rework the environment-variable handling soon.
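In case it helps, here is a minimal sketch of a per-request override, assuming an OpenLLM server on `localhost:3000` whose `/v1/generate` endpoint accepts an `llm_config` block (the exact payload shape can differ between versions, so check your server's `/docs` page):

```python
# Minimal sketch: override max_new_tokens for a single request instead of
# relying on an environment variable. Assumes the server accepts an
# `llm_config` override on /v1/generate -- verify against your version's docs.
import requests

resp = requests.post(
    "http://localhost:3000/v1/generate",
    json={
        "prompt": "Explain LoRA adapters in one sentence.",
        "llm_config": {"max_new_tokens": 64},  # per-request override
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```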
You can try `--quantize gptq` for now. I just have a lot of priorities at the moment.
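For example, on a GPTQ-quantized checkpoint, a start command along the lines of `openllm start llama --model-id <gptq-model-id> --quantize gptq` should pick up the quantized weights (the model id is a placeholder and flag names may vary between OpenLLM versions, so treat this as a sketch).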
@gbmarc1 can you try again with 0.3.5?