Stefano Luoni issues

Results: 4 issues by Stefano Luoni

ReAct doesn't return actions executed

It would be nice if ReAct returned the executed actions in its response, so that callers know which tools were used to generate the answer. As of now, ReAct only returns the observations,...

Support jinaai/jina-embeddings-v3

### Model description

jina-embeddings-v3 is a multilingual multi-task text embedding model designed for a variety of NLP applications. Based on the [Jina-XLM-RoBERTa architecture](https://huggingface.co/jinaai/xlm-roberta-flash-implementation), this model supports Rotary Position Embeddings to...

Backend error: `normalize` is not available for SPLADE models

### System Info

- Version of Text Embedding Inference: 1.6 (Turing)
- GPU: 1xTesla T4 16GB
- Deployment environment: Openshift 4 - Kubernetes version v1.28.15+ff493be
- Service info: `{"model_id":"naver/efficient-splade-VI-BT-large-doc","model_sha":"main","model_dtype":"float16","model_type":{"embedding":{"pooling":"splade"}},"max_concurrent_requests":512,"max_input_length":512,"max_batch_tokens":16384,"max_batch_requests":null,"max_client_batch_size":32,"auto_truncate":false,"tokenization_workers":1,"version":"1.6.0","sha":"57d8fc8128ab94fcf06b4463ba0d83a4ca25f89b","docker_label":"sha-57d8fc8"}`

### Information

- [x] Docker...

Support late chunking

### Feature request

It would be nice to have late chunking supported by the library, optionally activated by a parameter passed in the request. This feature is available, for example,...