vik issues

Results: 17 issues by vik

Save kwargs in HFClientVLLM

It's not possible to override max_tokens without doing this.
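
The idea can be sketched as follows. This is a minimal illustration, not the actual `HFClientVLLM` code: the class name `SketchVLLMClient`, the `build_payload` method, and the default values are all hypothetical. It only shows why saving the constructor kwargs makes `max_tokens` overridable.

```python
from typing import Any


class SketchVLLMClient:
    """Hypothetical stand-in for a vLLM client; not the real HFClientVLLM."""

    def __init__(self, model: str, **kwargs: Any) -> None:
        self.model = model
        # Save the constructor kwargs so they act as per-client defaults.
        # The built-in defaults here (75, 0.0) are assumed for illustration.
        self.kwargs = {"max_tokens": 75, "temperature": 0.0, **kwargs}

    def build_payload(self, prompt: str, **kwargs: Any) -> dict[str, Any]:
        # Per-call kwargs take precedence over the saved defaults.
        return {"model": self.model, "prompt": prompt, **self.kwargs, **kwargs}


client = SketchVLLMClient("some-model", max_tokens=512)
payload = client.build_payload("Hello", temperature=0.7)
```

If the kwargs were discarded in `__init__`, every request would fall back to the built-in default for `max_tokens`.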

Support running eval scripts on multiple GPUs

I'm starting to implement benchmarking scripts directly in this repository for reproducibility. We should support running them on multiple GPUs when available since they take a fair bit of time...

help wanted

good first issue

[Model] Add moondream vision language model

This pull request adds support for the [moondream](https://github.com/vikhyat/moondream) vision language model, implemented similarly to the existing LLaVA support. Usage:

```python
llm = LLM(
    model="vikhyatk/moondream2",
    trust_remote_code=True,
    image_input_type="pixel_values",
    image_token_id=50256,
    image_input_shape="1,3,378,378",
    image_feature_size=729,
    ...
```

performance regression on MPS

Appears to be caused by this:

```
/Users/vikhyat/Coding/moondream/.venv/lib/python3.12/site-packages/transformers/generation/utils.py:1513: UserWarning: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the MPS backend and will fall back to run on the CPU. This...
```

simplify weight loading logic

GPU memory utilization when running in Flask server

Add PIL fallback for image resizing