Isotr0py

139 comments by Isotr0py

You can put it in `_EMBEDDING_MODELS`.

#19872 should fix the CI deadlock, you can merge from main to get the fix.

See: https://docs.vllm.ai/en/latest/models/supported_models.html#id2

> You need to set the architecture name via `--hf-overrides` to match the one in vLLM. For example, to use DeepSeek-VL2 series models: `--hf-overrides '{"architectures": ["DeepseekVLV2ForCausalLM"]}'`...
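A rough sketch of what the override amounts to (the merge-by-`dict.update` behavior shown here is a simplifying assumption for illustration): the JSON payload replaces the architecture name in the loaded HF config with the one vLLM recognizes.

```python
import json

# Illustrative HF config before the override (field values are made up):
hf_config = {"model_type": "deepseek_vl_v2", "architectures": ["SomeUnknownArch"]}

# The value passed to --hf-overrides is a JSON object merged into the config:
overrides = json.loads('{"architectures": ["DeepseekVLV2ForCausalLM"]}')
hf_config.update(overrides)

print(hf_config["architectures"])  # ['DeepseekVLV2ForCausalLM']
```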

Seems it was just a network timeout when fetching the test video; retrying.
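For flaky fetches like this, a simple retry loop is usually enough; `fetch_with_retry` below is a hypothetical helper, not part of the actual CI code:

```python
import time

def fetch_with_retry(fetch, retries=3, delay=0.1):
    # Retry a flaky network fetch a few times before giving up.
    for attempt in range(retries):
        try:
            return fetch()
        except OSError:
            if attempt == retries - 1:
                raise
            time.sleep(delay)

# Simulate a fetch that times out once, then succeeds:
calls = []
def flaky_fetch():
    calls.append(1)
    if len(calls) < 2:
        raise OSError("network timeout")
    return b"video bytes"

print(fetch_with_retry(flaky_fetch))  # succeeds on the second attempt
```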

That's because the input image size is too large, causing the number of image tokens to exceed `max_position_embeddings=4096` (the default config value for deepseek-vl2). The model should work if you...
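Some back-of-envelope arithmetic for why a large image overflows the limit; the 14-pixel patch size and 1024x1024 image here are assumed values for illustration, not deepseek-vl2's actual tokenization parameters:

```python
# Rough estimate of image token count for a patch-based vision encoder.
max_position_embeddings = 4096  # deepseek-vl2 default from the config
patch = 14                      # assumed patch size
side = 1024                     # a large 1024x1024 input image

num_image_tokens = (side // patch) ** 2
print(num_image_tokens, num_image_tokens > max_position_embeddings)
```

With these assumptions the image alone already produces more tokens than the model's position limit, before any text tokens are counted.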

Oh, seems that there are some issues in deepseek-vl2's hf_processor, let me investigate it deeper tonight.

Seems this field was initially added to the CPU executor in the earliest CPU support PR (#3634) with the declaration missing in `CacheConfig`. Since everything worked well after that, nobody caught this...
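A minimal illustration of why the missing declaration went unnoticed (the class below is a simplified stand-in, not vLLM's actual `CacheConfig`): plain Python silently accepts attributes that were never declared on the class.

```python
class CacheConfig:  # simplified stand-in for illustration
    def __init__(self, block_size: int):
        self.block_size = block_size

cfg = CacheConfig(16)
cfg.cache_dtype = "auto"  # attribute never declared in __init__; no error raised
print(cfg.cache_dtype)
```

This is also why such bugs tend to surface only when a type checker or a `__slots__`/dataclass refactor makes the attribute set explicit.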

I'm not sure which quantization "int4 quantization" refers to here, because it seems there is no BNB 4-bit quantized Phi3-V model released on HF. (The code given above is using...

It costs about 4GB of VRAM to run the 4-bit AWQ quantized Phi-3.5-vision-instruct. BTW, the AWQ model I uploaded was calibrated with the default dataset in `autoawq`, because I just used it to...
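A rough sanity check of the ~4GB figure; the 4.2B parameter count is an assumed size for Phi-3.5-vision, and AWQ stores weights at roughly 4 bits each:

```python
# Back-of-envelope VRAM estimate for 4-bit quantized weights.
params = 4.2e9               # assumed parameter count
weight_gb = params * 4 / 8 / 1e9  # 4 bits per weight -> bytes -> GB

print(round(weight_gb, 1))   # ~2.1 GB for weights alone
```

Weights alone land around 2GB; the KV cache, activations, and the vision tower's overhead plausibly account for the rest of the ~4GB observed.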

> Not sure where is the best place to put this part.

I'm OK with moving this somewhere else. It seems that the `Nginx Loadbalancer` part is put under `Getting...