vik
It's not possible to override max_tokens without doing this.
I'm starting to implement benchmarking scripts directly in this repository for reproducibility. We should support running them on multiple GPUs when available since they take a fair bit of time...
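One way to spread a benchmark across GPUs is to give each worker process an interleaved slice of the samples. The following is a minimal sketch, not the repository's actual scripts: `shard_samples` is a hypothetical helper, and the worker count is assumed to come from something like `torch.cuda.device_count()`.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_samples(samples: Sequence[T], num_workers: int) -> List[List[T]]:
    """Split samples into num_workers interleaved shards (hypothetical helper).

    Worker i receives samples i, i + num_workers, i + 2 * num_workers, ...
    so shard sizes differ by at most one.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return [list(samples[i::num_workers]) for i in range(num_workers)]


# Assumption: one worker per visible GPU; fall back to a single worker otherwise.
shards = shard_samples(list(range(10)), num_workers=3)
```

Each worker would then run its shard on its own device, and the per-shard scores would be combined at the end, weighted by shard size.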
This pull request adds support for the [moondream](https://github.com/vikhyat/moondream) vision language model, implemented similarly to the existing LLaVA support. Usage:

```python
llm = LLM(
    model="vikhyatk/moondream2",
    trust_remote_code=True,
    image_input_type="pixel_values",
    image_token_id=50256,
    image_input_shape="1,3,378,378",
    image_feature_size=729,
    ...
```
Appears to be caused by this:

```
/Users/vikhyat/Coding/moondream/.venv/lib/python3.12/site-packages/transformers/generation/utils.py:1513: UserWarning: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the MPS backend and will fall back to run on the CPU. This...
```
This is to prepare for adding weight quantization support.
Reported on Twitter: https://x.com/HunterPunter_/status/1853203756202312100
This PR adds a fallback to PIL/Pillow for image resizing when PyVips is not available. Changes include:

- Modified image_crops.py to use PIL when PyVips import fails
- Added comprehensive...
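The import fallback described above can be sketched as follows. This is an illustration under assumptions, not the code from image_crops.py: `load_resize_backend` is a hypothetical helper name, and `OSError` is caught alongside `ImportError` on the assumption that a binding may fail to load its native library at import time.

```python
import importlib
from types import ModuleType
from typing import Sequence, Tuple


def load_resize_backend(
    candidates: Sequence[str] = ("pyvips", "PIL.Image"),
) -> Tuple[str, ModuleType]:
    """Return (name, module) for the first backend that imports cleanly.

    Hypothetical helper: tries each module name in order and falls
    through to the next one when the import fails.
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except (ImportError, OSError):
            # Backend not installed or its native library is missing.
            continue
    raise ImportError(f"no image backend available, tried: {list(candidates)}")


if __name__ == "__main__":
    try:
        backend_name, _ = load_resize_backend()
        print(f"resizing with {backend_name}")
    except ImportError as exc:
        print(exc)
```

Resolving the backend once at import time keeps the resize call sites free of repeated try/except blocks; they only need to branch on the returned name.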