Isotr0py


Hmmm, not sure why the test failed on CI while it passed locally. Let me try to reproduce the failure on another machine:

```
$ pytest -s -v tests/models/language/pooling/test_embedding.py -k...
```

Hmmm, it seems the problematic outputs are from `hf_runner`; here are the passing test0 outputs locally:

```
Test0: Cosine similarity: 0.9999
hf: array([-0.5522 , -0.01881, -0.438 , 0.87 , -0.04807,...
```

Hmmm, but I remember Sentence-Transformers initializes mean pooling by default for models missing a pooler config: https://github.com/UKPLab/sentence-transformers/blob/dd76c033ac2161a2958fee2e18fd0227a81ee921/sentence_transformers/SentenceTransformer.py#L1508-L1530
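For context, a minimal sketch of what that mean-pooling fallback computes (a masked average of token embeddings; this mirrors the intent of the linked code, not its exact implementation):

```python3
# Minimal sketch of masked mean pooling: average token embeddings,
# counting only positions where attention_mask == 1.
import torch


def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: [batch, seq_len, hidden]; attention_mask: [batch, seq_len]
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # avoid division by zero
    return summed / counts
```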

IIRC, [ssmits/Qwen2-7B-Instruct-embed-base](https://huggingface.co/ssmits/Qwen2-7B-Instruct-embed-base) was just converted from Qwen2-7B-Instruct by removing the `lm_head` and overwriting the `config.json`. Perhaps we can convert a smaller one to allow testing in fp32?
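A rough sketch of that kind of conversion, assuming a smaller donor checkpoint (`Qwen/Qwen2.5-0.5B-Instruct` is my pick here, and the `architectures` override is illustrative, not a verified recipe):

```python3
# Sketch: re-save a CausalLM checkpoint as a headless base model so it can
# serve as an embedding backbone. AutoModel loads Qwen2Model (no lm_head),
# so the lm_head weights are simply dropped on load.
from transformers import AutoModel, AutoTokenizer

src = "Qwen/Qwen2.5-0.5B-Instruct"  # assumed donor model
dst = "Qwen2.5-0.5B-embed-base"     # hypothetical output directory

model = AutoModel.from_pretrained(src)
tokenizer = AutoTokenizer.from_pretrained(src)

# Point config.json at the base architecture (assumption: this mirrors how
# the ssmits conversion overwrote its config).
model.config.architectures = ["Qwen2Model"]

model.save_pretrained(dst)
tokenizer.save_pretrained(dst)
```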

Hmmm, I tried to use `Qwen/Qwen2.5-0.5B-Instruct` as an embedding model in these MTEB tests with fp32, and it passes after using mean pooling on the vLLM runner:

```python3
#...
```
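For reference, a hedged sketch of forcing mean pooling on the vLLM side (argument and method names such as `override_pooler_config` and `llm.embed` may differ across vLLM versions):

```python3
# Sketch: serve a plain decoder checkpoint as an embedder with mean pooling
# explicitly overridden, instead of relying on any pooler config defaults.
from vllm import LLM
from vllm.config import PoolerConfig

llm = LLM(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    task="embed",
    dtype="float32",
    override_pooler_config=PoolerConfig(pooling_type="MEAN"),
)

outputs = llm.embed(["What is the capital of France?"])
print(outputs[0].outputs.embedding[:5])
```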

Oh, I finally found out what's wrong... Don't know why, but it seems `hf_model` on CI is using bidirectional attention (`is_causal=False`), probably because these converted Qwen2 models don't have `is_causal` in...
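To double-check that hypothesis, a quick probe (assuming the flag would live in the checkpoint's config; `is_causal` is not a standard `transformers` config field, so treat this as exploratory):

```python3
# Probe whether the converted checkpoint's config carries an `is_causal`
# flag at all; if missing, the attention implementation falls back to its
# own default, which could explain the bidirectional behavior on CI.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("ssmits/Qwen2-7B-Instruct-embed-base")
print(getattr(cfg, "is_causal", "<missing: attention default applies>"))
```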

I'm looking into this issue in `transformers` to make a quick-fix PR for it directly. I'll create a bug report if it's too complicated to fix.

Hmmm, I think the diffusion model implementation isn't quite stable currently (BTW, I think we may eventually need a `SupportsDiffusion` interface for DiT models): https://github.com/vllm-project/vllm-omni/blob/f1d8a007726376e3f09486b9f3a63ec0aa5d41d5/vllm_omni/worker/gpu_diffusion_model_runner.py#L280-L307

Another problem is that the...
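For illustration, a hypothetical sketch of what such an interface could look like, modeled on vLLM's existing `Supports*` protocols (every name and the method signature here are assumptions, not an actual vllm-omni API):

```python3
# Hypothetical `SupportsDiffusion` interface sketch in the style of vLLM's
# runtime-checkable capability protocols (e.g. SupportsMultiModal).
from typing import ClassVar, Protocol, runtime_checkable

import torch


@runtime_checkable
class SupportsDiffusion(Protocol):
    """Interface a DiT-style diffusion model would implement."""

    # Class-level capability flag, mirroring how vLLM marks other features.
    supports_diffusion: ClassVar[bool] = True

    def denoise_step(
        self,
        latents: torch.Tensor,
        timestep: torch.Tensor,
        conditioning: torch.Tensor,
    ) -> torch.Tensor:
        """Run one denoising step and return the updated latents."""
        ...
```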

> Maintainers decide whether a PR may affect NPU.

I prefer 3. But we could also use the mergify bot to automatically tag NPU-related PRs as well.