Roger Wang
Similar to what @ikalista mentioned in the original discussion, imo a better way is to mount model storage into the container for model loading, unless we want to rewrite the...
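For context, here is a minimal sketch of that mounting approach, assuming the standard `vllm/vllm-openai` Docker image and the default Hugging Face cache location; the paths, image tag, and model name are illustrative assumptions, not details from the original thread:

```bash
# Sketch only: mount the host's model cache into the container so weights are
# loaded from mounted storage rather than baked into the image or re-downloaded
# on every container start. Paths and image tag are assumptions.
docker run --gpus all --ipc=host -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model meta-llama/Llama-3.1-8B-Instruct
```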
> @ywang96 Is anybody working on direct model loading? Do we have a benchmark comparing mounting against loading directly into memory? Happy to work on this if nobody else...
@rabaja Can you share what's inside `./benchmark_serving.sh`? I cannot repro this with our benchmark script in the main branch.

My server launch command:

```
vllm serve meta-llama/Llama-3.1-8B-Instruct
```

Benchmark launch...
It would be great if you could clone the latest main branch and confirm that the benchmark script works for you.
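For reference, a plausible benchmark invocation against the server command quoted above, assuming the `benchmarks/benchmark_serving.py` script that ships in the vLLM repo; since the actual contents of `./benchmark_serving.sh` were not shared, the flags, dataset, and prompt count below are assumptions:

```bash
# Hypothetical benchmark run; the real ./benchmark_serving.sh may differ.
python benchmarks/benchmark_serving.py \
  --backend vllm \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --dataset-name sharegpt \
  --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json \
  --num-prompts 1000
```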
@p88h This is amazing! Have you tried running some benchmarks to see the throughput impact of this PR?
cc @youkaichao
@DarkLight1337 @WoosukKwon Here's a short repro script - let me know if this is reasonable.

```python
import time

from vllm import LLM

st = time.perf_counter()
llm = LLM(model="Qwen/Qwen2.5-VL-3B-Instruct", enforce_eager=True)
print("Time...
```
Oops - I forgot to turn on the ready label and auto-merge. Doing it now!
We're a bit overwhelmed by things to work on, so any help/contribution is definitely welcome! Supporting this model should be straightforward since it's also LLaVA-style, like many other VLMs we...
> Is the support included in release 0.6.2?

@premg16 0.6.2 has already been released, so no, but we will make a new release when this model is supported by...