Alexandre Strube comments

Results: 193 comments of Alexandre Strube

add support for bedrock, togetherai, huggingface tgi, replicate, ai21, cohere, ai21

This makes a lot of sense. With this, @krrishdholakia do you think we could have a single worker instead of our model_worker, sglang_worker and vllm_worker and then point to the...

add support for bedrock, togetherai, huggingface tgi, replicate, ai21, cohere, ai21

@krrishdholakia could you add the parameter as suggested by @Ying1123? It would be super cool to have this feature here!

add seed params support from vllm_worker

@infwinston what do you think? Makes perfect sense to me!

"SyntaxWarning: invalid escape sequence \("

Still in August :-)

Integrate vllm Error, TypeError: top_k must be an integer, got float

I can confirm. Changing from 1.0 to 1 fixes this.
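
As a minimal sketch of the fix described above: the error appears when `top_k` arrives as a float such as `1.0` where an integer is expected. The helper below, `normalize_top_k`, is a hypothetical name and not part of vLLM or FastChat; it only illustrates coercing an integral float to `int` before the value is passed on.

```python
def normalize_top_k(top_k):
    """Return top_k as an int, accepting integral floats such as 1.0.

    Hypothetical helper for illustration; not an API of any library.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(top_k, bool):
        raise TypeError("top_k must be an integer, got bool")
    if isinstance(top_k, int):
        return top_k
    # Accept floats that carry no fractional part (1.0 -> 1).
    if isinstance(top_k, float) and top_k.is_integer():
        return int(top_k)
    raise TypeError(f"top_k must be an integer, got {type(top_k).__name__}")


top_k = normalize_top_k(1.0)  # an int, safe to pass where an integer is required
```

Rejecting non-integral floats such as `0.5` instead of rounding them keeps a misconfigured value visible rather than silently changing it.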

Hardcoded host localhost and port 9090 for a rate monitor

@infwinston yea, but even so, there's no port attributed to the call_monitor.py, right?

【bug】KeyError: 'factor' when use Triplex

This is the same for phi-3.5

[Feature]: Support Gemma3 GGUF

The OP has this at the last line
```python
  File "/home/hackey/AI/vllm/venv/lib/python3.12/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 399, in load_gguf_checkpoint
    raise ValueError(f"GGUF model with architecture {architecture} is not supported yet.")
ValueError: GGUF model with architecture...
```

support for 4bit quantization from transformer library.

@02shanks absolutely!!!!

support for 4bit quantization from transformer library.

Well, the usual:
- fork the repo,
- branch it into a relevant name,
- and contribute ONLY those changes related to the issue.
- keep the repo up-to-date with...