kaixuanliu

Results 28 comments of kaixuanliu

@OlivierDehaene, hi, can you help review? Thanks!

@Narsil @OlivierDehaene, hi, I have updated this PR with the latest code base. Please help review.

@regisss @Narsil, please help review.

Steps to reproduce on CPU:

1. Build the docker container: `docker build --build-arg PLATFORM="cpu" -f Dockerfile-intel -t tei_cpu .`
2. Start the backend: ``` model='jinaai/jina-embeddings-v2-base-code' volume=$PWD/data docker run -p 8080:80 -v...
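The backend-start command above is cut off mid-line. A minimal sketch of how such a command typically looks, assuming the standard text-embeddings-inference CLI shape (`--model-id` flag, port mapping, and volume mount); the exact truncated flags may differ:

```shell
# Sketch only — the original command is truncated, so the tail here is an
# assumption based on the usual TEI invocation pattern, not the author's
# exact command.
model='jinaai/jina-embeddings-v2-base-code'
volume=$PWD/data

# Run the locally built CPU image, mounting a cache volume for model
# weights and exposing the server on localhost:8080.
docker run -p 8080:80 -v "$volume":/data tei_cpu --model-id "$model"
```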

I have not seen this unexpected behavior with other models, but the three I listed above are in the TEI README. Our customers are asking for support for these three models:...

I will switch to a safer approach to support these three models.

@Narsil Hi, I validated this PR: it works well on CPU and XPU, but on HPU I cannot start the backend. I checked with `pip list`, and it...

In [#1292](https://github.com/huggingface/optimum-habana/pull/1292), `args.use_kv_cache` was set to `False` by default, which greatly slows down performance and causes CI failures.

Command line: `python3 run_pipeline.py --model_name_or_path google/paligemma-3b-mix-224 --use_hpu_graphs --bf16`; this PR should work with [#1279](https://github.com/huggingface/optimum-habana/pull/1279).