Isotr0py
You can put it in `_EMBEDDING_MODELS`.
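For reference, a minimal sketch of what that registration could look like, assuming the registry still maps the architecture name to a `(module, class)` tuple as in `vllm/model_executor/models/registry.py` (the new entry below is a hypothetical example, not a real model):

```python
# vllm/model_executor/models/registry.py (sketch)
_EMBEDDING_MODELS = {
    # architecture name -> (module under vllm/model_executor/models, class name)
    "MistralModel": ("llama", "LlamaEmbeddingModel"),
    # Hypothetical new entry:
    "MyEmbeddingModel": ("my_model", "MyEmbeddingModel"),
}
```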
#19872 should fix the CI deadlock, you can merge from main to get the fix.
See: https://docs.vllm.ai/en/latest/models/supported_models.html#id2

> You need to set the architecture name via `--hf-overrides` to match the one in vLLM. For example, to use DeepSeek-VL2 series models: `--hf-overrides '{"architectures": ["DeepseekVLV2ForCausalLM"]}'`
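Concretely, a minimal sketch of applying that override from Python, assuming the `LLM` entrypoint forwards `hf_overrides` to the engine the same way the CLI flag does (the checkpoint name is just one of the DeepSeek-VL2 repos):

```python
from vllm import LLM

# Override the architecture reported by the HF config so vLLM
# resolves it to its own DeepseekVLV2ForCausalLM implementation.
llm = LLM(
    model="deepseek-ai/deepseek-vl2-tiny",
    hf_overrides={"architectures": ["DeepseekVLV2ForCausalLM"]},
)
```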
Seems to be just a network timeout when fetching the test video; retrying.
That's because the input image size is too large, causing the number of image tokens to exceed `max_position_embeddings=4096` (the default config value for deepseek-vl2). The model should work if you...
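One possible workaround along those lines, as a hedged sketch: shrink the image before passing it in, so the image-token count stays within the 4096-position budget (the prompt format and the size cap below are assumptions, not values from the original comment):

```python
from PIL import Image
from vllm import LLM

llm = LLM(
    model="deepseek-ai/deepseek-vl2-tiny",
    hf_overrides={"architectures": ["DeepseekVLV2ForCausalLM"]},
)

image = Image.open("example.jpg")
image.thumbnail((1024, 1024))  # cap the resolution so fewer image tokens are produced

outputs = llm.generate({
    "prompt": "<image>\nDescribe this image.",
    "multi_modal_data": {"image": image},
})
print(outputs[0].outputs[0].text)
```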
Oh, it seems there are some issues in deepseek-vl2's hf_processor; let me dig into it deeper tonight.
Seems this field was initially added to the CPU executor in the earliest CPU support PR (#3634) with its declaration missing from `CacheConfig`. Since everything worked well after that, nobody caught this...
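For context, a minimal illustration of why this can go unnoticed (the field name below is hypothetical, not the actual one from #3634): Python happily sets attributes that were never declared at class level, so the missing declaration only surfaces for type checkers and readers, never at runtime:

```python
class CacheConfig:
    def __init__(self, block_size: int) -> None:
        self.block_size = block_size
        # Runs fine even though this attribute is never declared on the
        # class -- mirroring the situation described above.
        self.cpu_kvcache_space = 4  # hypothetical field name (GiB)
```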
[New Model]: We are able to run the Phi-3.5-vision-instruct model but want to run it in int4 quantization
I'm not sure what exactly "int4 quantization" means here, because it seems there is no BNB 4-bit quantized Phi3-V model released on HF. (The code given above is using...
It takes about 4 GB of VRAM to run the 4-bit AWQ-quantized Phi-3.5-vision-instruct. BTW, the AWQ model I uploaded was calibrated with the default dataset in `autoawq`, because I just used it to...
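For anyone who wants to try it, a hedged sketch of loading such an AWQ checkpoint in vLLM (the repo id is a placeholder for the uploaded model, not its actual name):

```python
from vllm import LLM

llm = LLM(
    model="Isotr0py/Phi-3.5-vision-instruct-AWQ",  # placeholder repo id
    quantization="awq",
    trust_remote_code=True,  # Phi-3.5-vision ships custom modeling code
    max_model_len=4096,      # keep the KV cache small on a ~4 GB budget
)
```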
> Not sure where is the best place to put this part. I'm OK to move this to somewhere else.

Seems that the `Nginx Loadbalancer` part is put under `Getting...