kaixuanliu
@OlivierDehaene Hi, can you help review? Thanks!
@Narsil @OlivierDehaene Hi, I have updated this PR with the latest code base. Please help review.
@regisss @Narsil Please help review.
Steps to reproduce on CPU:
1. Build the docker container: `docker build --build-arg PLATFORM="cpu" -f Dockerfile-intel -t tei_cpu .`
2. Start the backend:
```
model='jinaai/jina-embeddings-v2-base-code'
volume=$PWD/data
docker run -p 8080:80 -v...
```
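For reference, a full launch command in the style of the TEI README might look like the sketch below. The truncated snippet above is left as-is; the image name `tei_cpu` comes from the build step, while the volume mount and `--model-id` flag are assumptions based on the standard TEI launch pattern:

```shell
# Hypothetical completion of the truncated command above, following the
# usual TEI README launch pattern; adjust flags to your environment.
model='jinaai/jina-embeddings-v2-base-code'
volume=$PWD/data
docker run -p 8080:80 -v $volume:/data tei_cpu --model-id $model
```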
I have not encountered this unexpected behavior with other models. But the three I listed above are in the TEI README, and our customers are asking for support for these 3 models:...
I will switch to a safer approach to support these 3 models.
@Narsil Hi, I validated this PR; it works well on CPU and XPU. But on HPU I cannot start the backend, and I did a check with `pip list`, it...
Change default `use_cache` param to `True`, to align with the former implementation and make CI pass
In [#1292](https://github.com/huggingface/optimum-habana/pull/1292), `args.use_kv_cache` was set to `False` by default, which greatly slows down performance and causes CI failures.
This PR changes the default `use_cache` param back to `True`, to align with the former implementation and make CI pass.
@regisss @libinta Please help review.
Command line: `python3 run_pipeline.py --model_name_or_path google/paligemma-3b-mix-224 --use_hpu_graphs --bf16`. This PR should work together with [#1279](https://github.com/huggingface/optimum-habana/pull/1279).