wirthual
Hi @shalousun this is a great idea. Currently the logger init happens [here](https://github.com/michaelfeil/infinity/blob/f98ccf49b6146b225afc6f679539be9a2e13af68/libs/infinity_emb/infinity_emb/log_handler.py#L16).
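For illustration, a minimal sketch of what a configurable logger init could look like — the environment variable name `INFINITY_LOG_LEVEL` and the function shape are assumptions for this example, not the actual code in `log_handler.py`:

```python
import logging
import os

def init_logger(name: str = "infinity_emb") -> logging.Logger:
    # INFINITY_LOG_LEVEL is a hypothetical env var used here as an example;
    # it falls back to INFO for unknown or missing values.
    level_name = os.environ.get("INFINITY_LOG_LEVEL", "INFO").upper()
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level_name, logging.INFO))
    return logger

logger = init_logger()
```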
Try an adapted version like this:

```
infinity_emb v2 --model-id tomaarsen/Qwen3-Reranker-0.6B-seq-cls
```

Note: you might need to update your transformers version.
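Once the server from the command above is running, a rerank request body could be built like this — the field names follow the common rerank API shape and are an assumption for this sketch, not verified against infinity's schema:

```python
import json

# Hypothetical /rerank request payload; the model id matches the
# command above, the query and documents are placeholders.
payload = {
    "model": "tomaarsen/Qwen3-Reranker-0.6B-seq-cls",
    "query": "What is vector search?",
    "documents": [
        "Vector search finds similar embeddings.",
        "Bananas are yellow.",
    ],
}
body = json.dumps(payload)
```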
Hi @fabriziofortino , thanks for the detailed issue. Can you run infinity with `--engine torch` and see if you get the expected output?
With optimum, infinity uses the quantized model by default on CPU if the HF repo provides one. To compare the outputs, we need to make sure we run...
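To make the comparison concrete, one way to check how close the two engines' outputs are is cosine similarity — the vectors below are placeholders standing in for embeddings returned by two infinity runs with different `--engine` settings:

```python
import math

def cosine_similarity(a, b):
    # Plain cosine similarity, no external dependencies.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Placeholder values; real vectors would come from the optimum
# (quantized) run and the torch (fp32) run respectively.
emb_optimum = [0.12, -0.48, 0.31]
emb_torch = [0.11, -0.47, 0.33]

similarity = cosine_similarity(emb_optimum, emb_torch)
```

If the quantized model is healthy, the similarity to the fp32 reference should stay very close to 1.0.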
@fabriziofortino I assume the model used is still quantized (on CPU, that's the default). Working on PR #635 for easy selection of the unquantized version. Did a quick test...
Hi @amit-jain > * How do we disable quantization with optimum - use `--dtype float32` for above? And it will also lead to increase in latency right? > going forward,...
Absolutely. According to [this issue](https://github.com/huggingface/optimum/issues/2277): > please use transformers' attention implementation: https://huggingface.co/docs/transformers/main/en/llm_optims#attention and torch.compile (with static cache if decoder): https://huggingface.co/docs/transformers/main/en/llm_optims#static-kv-cache-and-torchcompile for the best possible performance (exceeding bettertransformer, which no one...
@michaelfeil One option to keep bettertransformer as it was is #641: it extracts the bettertransformer code into its own package. This should get rid of the version error while keeping the original bettertransformer code....
I tried to reproduce your error using this Dockerfile:

```Dockerfile
FROM michaelf34/infinity:latest-cpu
WORKDIR /app
RUN apt-get update && apt-get install -y git git-lfs && rm -rf /var/lib/apt/lists/*
RUN git lfs...
```
Hi, the easiest way would be to provide the `task` and `prompt` at the startup of the model. Would this cover your use case? The prompt could be passed in and...
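As a rough illustration of the idea, a startup prompt could simply be prepended to every input before embedding — the function name and prompt text below are illustrative, not the actual infinity API:

```python
def apply_prompt(prompt: str, texts: list[str]) -> list[str]:
    # Hypothetical helper: prefix each input with the prompt that was
    # configured at model startup, before the texts are embedded.
    return [f"{prompt}{text}" for text in texts]

queries = apply_prompt("query: ", ["what is infinity?"])
```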