SangBin Cho

292 comments by SangBin Cho

I found that when I don't specify this, the following is returned:

```
Thread 0x7FB1278F5740 (active): "MainThread"
    main_loop (ray/_private/worker.py:763)
    (ray/_private/workers/default_worker.py:233)
Thread 860 (idle): "ray_import_thread"
    wait (threading.py:300)
    _wait_once (grpc/_common.py:106)
    wait (grpc/_common.py:148)
    result (grpc/_channel.py:735)
    _poll_locked...
```

Has this issue been resolved? I am observing this behavior in https://github.com/ray-project/ray/ when we run `gpustat.new_query()` repeatedly on GCE. ![profile](https://github.com/wookayin/gpustat/assets/18510752/5db64f3b-b3a7-4e4a-ac6b-93e603159ac4)

Lots of time is spent in `nvmlInit`, `nvmlShutdown`, and `nvmlDeviceGetHandleByIndex`.
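One way to avoid paying that cost on every query is to initialize NVML once and cache device handles across queries, instead of running init/shutdown per call. A minimal sketch of the pattern in plain Python; the `_nvml_init` and `_get_handle` methods here are hypothetical stand-ins for pynvml's `nvmlInit()` and `nvmlDeviceGetHandleByIndex()`, not the actual gpustat implementation:

```python
class CachedGpuQuerier:
    """Sketch: pay the expensive init cost once, reuse handles per query.

    The NVML calls are mocked out with placeholders; swap in the real
    pynvml calls when adapting this pattern.
    """

    def __init__(self):
        self._initialized = False
        self._init_calls = 0          # instrumentation for this sketch
        self._handles = {}            # device index -> cached handle

    def _nvml_init(self):
        # Real code would call pynvml.nvmlInit() here, guarded the same way.
        if not self._initialized:
            self._init_calls += 1
            self._initialized = True

    def _get_handle(self, index):
        # Real code: pynvml.nvmlDeviceGetHandleByIndex(index), cached.
        if index not in self._handles:
            self._handles[index] = f"handle-{index}"
        return self._handles[index]

    def query(self, index=0):
        self._nvml_init()             # no-op after the first call
        handle = self._get_handle(index)
        # Real code would read utilization/memory via the handle here.
        return handle
```

With this shape, repeated `query()` calls reuse the same handle, so the hot spots in the profile above (init, shutdown, handle lookup) collapse into one-time costs.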

Duplicate of https://github.com/ray-project/ray/issues/29758

I believe that if you implement a tokenizer class that works with this API (https://github.com/vllm-project/vllm/blob/3492859b687ba18db47720bcf6f07289999a2df5/vllm/transformers_utils/tokenizer_group/tokenizer_group.py#L42), you can use https://github.com/vllm-project/vllm/blob/3492859b687ba18db47720bcf6f07289999a2df5/vllm/entrypoints/llm.py#L118 to set the tokenizer.
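As a rough illustration of what "a tokenizer class" means here, a toy duck-typed tokenizer might look like the sketch below. The `encode`/`decode` method names are assumptions for illustration only; the exact interface vLLM expects is defined at the TokenizerGroup link above and should be checked against that file:

```python
class WhitespaceTokenizer:
    """Hypothetical minimal tokenizer sketch (not vLLM's real interface).

    Builds a vocabulary lazily: each previously unseen whitespace-separated
    token gets the next integer id.
    """

    def __init__(self):
        self.vocab = {}       # token string -> integer id
        self.inv_vocab = {}   # integer id -> token string

    def encode(self, text):
        ids = []
        for tok in text.split():
            if tok not in self.vocab:
                idx = len(self.vocab)
                self.vocab[tok] = idx
                self.inv_vocab[idx] = tok
            ids.append(self.vocab[tok])
        return ids

    def decode(self, ids):
        return " ".join(self.inv_vocab[i] for i in ids)
```

Any class with the methods the linked API actually requires could then be passed in the same way.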

Is this the same when you compare the same vLLM version?

Oh, I meant to use 0.2.7 for "2" (since you cannot upgrade openllm).
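For an apples-to-apples comparison, the second environment could be pinned like this (a sketch; the fresh-virtualenv setup is an assumption, not part of the original comment):

```shell
# Hypothetical setup: isolate the comparison run in its own environment
python -m venv vllm-027-env
. vllm-027-env/bin/activate
pip install "vllm==0.2.7"
python -c "import vllm; print(vllm.__version__)"
```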