Stefano Luoni

4 issues by Stefano Luoni

It would be nice for ReAct to return the actions in the response, so that it is clear which tools were used to generate the answer. As of now, ReAct only returns the observations,...

### Model description

jina-embeddings-v3 is a multilingual multi-task text embedding model designed for a variety of NLP applications. Based on the [Jina-XLM-RoBERTa architecture](https://huggingface.co/jinaai/xlm-roberta-flash-implementation), this model supports Rotary Position Embeddings to...

### System Info

- Version of Text Embedding Inference: 1.6 (Turing)
- GPU: 1xTesla T4 16GB
- Deployment environment: OpenShift 4 - Kubernetes version v1.28.15+ff493be
- Service info: `{"model_id":"naver/efficient-splade-VI-BT-large-doc","model_sha":"main","model_dtype":"float16","model_type":{"embedding":{"pooling":"splade"}},"max_concurrent_requests":512,"max_input_length":512,"max_batch_tokens":16384,"max_batch_requests":null,"max_client_batch_size":32,"auto_truncate":false,"tokenization_workers":1,"version":"1.6.0","sha":"57d8fc8128ab94fcf06b4463ba0d83a4ca25f89b","docker_label":"sha-57d8fc8"}`

### Information

- [x] Docker...

### Feature request

It would be nice to have late chunking supported by the library, optionally activated by a parameter passed in the request. This feature is available, for example,...