rawsh-rubrik

Results 6 comments of rawsh-rubrik

Getting batch_size=32, avg tokens per sentence=1024, embeddings/sec: 47.83 with `BAAI/bge-reranker-v2-m3` + `--no-bettertransformer`
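For a rough sense of scale, the reported figures can be converted into approximate token throughput and time per batch. This is only a back-of-envelope sketch: it assumes "embeddings/sec" counts individual scored inputs and that each input really averages 1024 tokens. The variable names are illustrative and not taken from the benchmark script.

```python
# Figures copied from the comment above; everything derived is an estimate.
batch_size = 32
avg_tokens_per_sentence = 1024
items_per_sec = 47.83  # reported as "embeddings/sec" (assumed: inputs scored per second)

# Assumption: every input is roughly avg_tokens_per_sentence tokens long.
tokens_per_sec = items_per_sec * avg_tokens_per_sentence

# Time to work through one full batch at the reported rate.
seconds_per_batch = batch_size / items_per_sec

print(f"approx tokens/sec: {tokens_per_sec:,.0f}")
print(f"approx seconds per batch of {batch_size}: {seconds_per_batch:.2f}")
```

Under those assumptions this comes to roughly 49k tokens per second, or about two thirds of a second per batch of 32.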

Will try this, but I have flash-attn installed in this image:

```
FROM nvidia/cuda:12.1.1-devel-ubuntu22.04 AS base

ENV PYTHONUNBUFFERED=1 \
    \
    # pip
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    \
    PYTHON="python3.10"...
```

@michaelfeil gotcha thanks! will fork for now

Not sure which cases trigger it; it seemed to start happening a few thousand requests in

@michaelfeil [Yes](https://huggingface.co/Xenova/ms-marco-TinyBERT-L-2-v2), also throws the same for me