Isotr0py issues

Results: 20 issues of Isotr0py

[Bugfix][Hardware][CPU] Fix CPU model input for decode

FIX #9024 - Minor fix for `IndexError: list index out of range` on the CPU backend.

x86 CPU

[Core] Refactor GGUF parameters packing and forwarding

This PR aims to refactor the GGUF implementation for merged linear layers (`qkv_proj` and `gate_up_proj`)...

Add support for Phi-3-vision series model

- Add support for Phi-3-vision models ([microsoft/Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct) and [microsoft/Phi-3.5-vision-instruct](https://huggingface.co/microsoft/Phi-3.5-vision-instruct))

[VLM] Support multimodal inputs for Florence-2 models

FIX #5934

[Misc] Avoid calling unnecessary `hf_list_repo_files` for local model path

- Error logs are emitted because `hf_list_repo_files` is called even when the model repo is a local path:
```
INFO 02-16 13:46:11 __init__.py:190] Automatically detected platform cuda.
ERROR 02-16 13:46:11 config.py:102] Error retrieving...
```
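The idea of skipping the remote listing for a local path can be sketched as below. This is a minimal illustration under stated assumptions, not the code from the PR: `list_model_files` and `remote_lister` are hypothetical names, and `remote_lister` stands in for whatever function queries the Hugging Face Hub.

```python
from pathlib import Path
from typing import Callable, List


def list_model_files(model: str, remote_lister: Callable[[str], List[str]]) -> List[str]:
    """Return the files that make up `model`.

    Hypothetical helper for illustration only; the names are not taken
    from the PR. `remote_lister` is any callable that lists files for a
    remote repo id.
    """
    local_dir = Path(model)
    if local_dir.is_dir():
        # Local directory: walk it directly and never call the remote lister,
        # so no network error can be logged for a local model.
        return sorted(
            p.relative_to(local_dir).as_posix()
            for p in local_dir.rglob("*")
            if p.is_file()
        )
    # Anything that is not an existing directory is treated as a remote repo id.
    return remote_lister(model)
```

Passing the remote lister in as a parameter keeps the sketch free of network dependencies and makes it easy to check that the local branch never triggers a remote call.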

[Bugfix] Fix failing transformers dynamic module resolving with spawn multiproc method

Issue discussion on Slack: https://vllm-dev.slack.com/archives/C07R5Q1Q2BB/p1739776343893149?thread_ts=1739553140.299949&cid=C07R5Q1Q2BB
- The `transformers` backend failed to load custom modules on the multiproc executor with `VLLM_WORKER_MULTIPROC_METHOD=spawn` because the custom module was falsely treated as already loaded.
- This PR optimizes the automap resolving...

[Model] Enable quantization support for `transformers` backend

- [x] Add BNB support for the `transformers` backend
- [x] Update the available quantization methods in the `transformers` backend docs.

documentation

ready

needs-rebase
