Zoltan Fedor

Results 78 comments of Zoltan Fedor

Hah, I just linked from my old issue to your newly created one, as I thought maybe the one you created will get more attention than mine. Hah :-)

@etiennedi, I am getting confused: is filtering with BM25 already supported? I was just looking at my Haystack code and saw that just a few days ago...

Okay, then that Haystack PR cannot be correct. That is odd, as it has a unit test which I wrote back in July which catches the error thrown by Weaviate and...

Hi @etiennedi, As suspected, that Haystack PR was wrong: it incorrectly assumed that Weaviate now supports filters with BM25 (and also included a bug causing it, in reality, to run an...

The note from the KV-cache implementation on BART states: _"Note: current implementation of K-V cache does not exhibit performance gain over the non K-V cache TensorRT version. Please consider to...

We are very much looking forward to that! Hopefully that also applies to the scenario of the OP - large inputs to T5 models.

> While waiting for this update, we started using NVIDIA's FasterTransformer library instead. It has a highly optimized T5 GPU runtime with KV cache supported and it's 5-10x faster than...

Also, Haystack could be used for model serving - as a replacement for OpenAI for those who want to serve their own LLM. Haystack pipelines integrate with Ray Serve to...

Yeah, I do use Haystack pipelines with nodes acting as clients for NVIDIA Triton for serving the LLMs locally / building LangChain tools for the agent blazing fast.

@notkriswagner, Thanks. I have actually never tested by killing a PHP script. No, I have Apache - PHP 7.2 (mod_php7) and when HTTP calls which execute a Snowflake query get...