mathislucka

Results: 15 comments by mathislucka

It is actually already there :D

Reopening this, as the device is not used for the inferencer. See: https://github.com/deepset-ai/haystack/blob/632cd1c141a8b485c6ef8695685d2d8eef3ca50f/haystack/modeling/infer.py#L229

Additionally, transformers does not currently support PyTorch 1.12 (see https://github.com/huggingface/transformers/issues/17971#issuecomment-1172324921). When changing the code in the inferencer to pass on the `mps` device, an error is raised during prediction: ``` Inferencing...

Also see this issue for the current state of supported ops in the `mps` backend: https://github.com/pytorch/pytorch/issues/77764

There is this dataset with cross-encoder scores for MarginMSELoss: https://huggingface.co/datasets/sentence-transformers/msmarco-hard-negatives

I'd vote for implementing `MultipleNegativesRankingLoss` because MarginMSE is already used in GPL and MNRL also yields very good results. What...

Oh and training definitely makes sense. If you have labeled data, you will get much better results with training than with the out-of-the-box models.

My 2 cents:

> Which sentence-transformer model(s) do we suggest for out-of-the-box use?

> Now it makes sense to use and promote v5 models (?)

The v5 models are only...

Yes, your understanding is correct. I understand your concerns, but I think I addressed these complications:

- the filter parameters can be passed as JSON representations of the parameters needed....

@sjrl already found a way to make e5 work. One thing that we could improve upon, though, is that e5 requires documents to be prefixed with `passage:` and queries with...

> This node uses composition with another node which is generally not something we would like for v2. Instead, we would probably prefer if the node was split in two...