Farbod Bijary

Results: 9 comments of Farbod Bijary

> * did not understand deleting multi playlist. You can select multiple playlists and delete them at once.

I created two empty playlists, selected both, and tried to delete...

@compilade You're right, using `GGMLQuantizationType` is more readable and idempotent. I tested your suggested fix (created a patch and applied it), and it fixes this issue. I used...
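For context, a minimal sketch (not the patch from the PR) of why comparing against the enum reads better than a raw type id, using the `gguf` Python package; the file path is a placeholder:

```python
# Minimal sketch: comparing a tensor's type against GGMLQuantizationType
# instead of a bare integer makes the intent explicit.
from gguf import GGUFReader, GGMLQuantizationType

reader = GGUFReader("model.gguf")  # placeholder path to a local GGUF file
for tensor in reader.tensors:
    # tensor.tensor_type is a GGMLQuantizationType, so this compares by name
    # rather than against a magic number like 0 or 1
    if tensor.tensor_type == GGMLQuantizationType.F16:
        print(f"{tensor.name} is stored as float16")
```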

Is the [failed CI check](https://github.com/ggerganov/llama.cpp/actions/runs/10333572638/job/28606150773?pr=8928) required for merging this PR, and do I need to do anything about it? It does not seem to be related to this PR.

@compilade @ggerganov Since this PR has been approved and labeled `merge-ready` but has had no activity in the past 5 days, I wanted to follow up to prevent it from...

> My concern is that batching may not increase speed.
>
> However, maybe we should implement a queue interface for users who need to submit multiple requests in a sequence....
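To make the queue idea concrete, here is a hypothetical sketch (the `run_pipeline` function is a stand-in, not vllm-omni's actual API): callers submit prompts to a single worker that processes them strictly one at a time.

```python
import queue
import threading

def run_pipeline(prompt: str) -> str:
    # Stand-in for the real single-prompt generation call.
    return f"<image for: {prompt}>"

# Each job carries the prompt plus a one-slot queue used to hand the result back.
jobs: "queue.Queue[tuple[str, queue.Queue]]" = queue.Queue()

def worker() -> None:
    while True:
        prompt, reply = jobs.get()
        reply.put(run_pipeline(prompt))  # prompts are processed strictly in order
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(prompt: str) -> str:
    reply: queue.Queue = queue.Queue(maxsize=1)
    jobs.put((prompt, reply))
    return reply.get()  # blocks until this prompt's turn has completed

if __name__ == "__main__":
    for p in ["a red fox in snow", "a lighthouse at dusk"]:
        print(submit(p))
```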

> I’m not sure: do you want to batch multiple requests, or batch multiple prompts within a single request?

My end-to-end use case is to run the following...
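To illustrate the distinction being asked about, here are two hypothetical request shapes (the field names are made up, not vllm-omni's schema):

```python
# Several single-prompt requests, submitted one after another:
single_prompt_requests = [
    {"prompt": "a red fox in snow"},
    {"prompt": "a lighthouse at dusk"},
]

# One request carrying several prompts, to be batched inside the pipeline:
multi_prompt_request = {
    "prompts": ["a red fox in snow", "a lighthouse at dusk"],
}
```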

> > The problem is batching may not yield performance gain. For example (qwen-image):
> >
> > 1 prompt per request: 66s
> > 2 prompts per request: 131s
> >
> > Since it's already...
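For reference, 131 s for 2 prompts is about 65.5 s per prompt, essentially the same as the 66 s single-prompt case, which is why the quoted numbers show no throughput gain. Below is a self-contained sketch of that comparison, with a stubbed `pipe` standing in for the real pipeline call:

```python
import time

def pipe(prompt_or_prompts):
    # Stub for the real pipeline; cost is made to scale linearly with the
    # number of prompts, mimicking the 66s vs 131s behaviour quoted above.
    prompts = prompt_or_prompts if isinstance(prompt_or_prompts, list) else [prompt_or_prompts]
    time.sleep(0.1 * len(prompts))
    return ["<image>"] * len(prompts)

prompts = ["a red fox in snow", "a lighthouse at dusk"]

t0 = time.perf_counter()
for p in prompts:
    pipe(p)                      # 1 prompt per request, run sequentially
print(f"1 prompt per request:  {time.perf_counter() - t0:.2f}s")

t0 = time.perf_counter()
pipe(prompts)                    # 2 prompts in a single batched request
print(f"2 prompts per request: {time.perf_counter() - t0:.2f}s")
```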

> I recommend you to read this RFC #290

Thanks. I have performed the benchmark; you can see that from 9:40 until 10:00 I enabled batch mode (batch_size = 4)...

I am eager to contribute this feature for [pipeline_qwen_image.py](https://github.com/vllm-project/vllm-omni/blob/96e4690013161b84a70b626fa787948ea4f6ab09/vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image.py), as I currently need it and therefore already have a working version of the implementation locally.