Add instructions for embedding
Hi, maybe a feature request, but lots of open-source embedding models require additional instructions for best performance (e.g. nomic-embed-text-v1.5, multilingual-e5-large-instruct).
Is there a way to provide these instructions?
Regards.
Hey, so Graphiti has a generic abstract EmbedderClient that is used for our actual calls. It defines the embedderClient.create and embedderClient.create_batch methods. If you want to use an embedder that isn't currently supported, you can define a new class that inherits from EmbedderClient and implements those methods.
For example, you could make a NomicEmbed(EmbedderClient) class and either have it instantiate the embedding instructions or use some defaults on the nomicEmbed.create call. I can help advise on this and review if you need help implementing this.
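A minimal sketch of what that could look like, not a tested integration. The EmbedderClient below is a simplified stand-in so the snippet runs on its own (the real base class and its exact signatures live in graphiti), and embed_fn and fake_embed are hypothetical placeholders for whatever actually serves the model. The "search_document: " prefix is the task prefix nomic-embed-text-v1.5 expects for documents.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Awaitable, Callable

# Simplified stand-in for graphiti's EmbedderClient (assumption: the real
# class has the same two method names, but its signatures may differ).
class EmbedderClient(ABC):
    @abstractmethod
    async def create(self, input_data: str) -> list[float]: ...

    @abstractmethod
    async def create_batch(self, input_data_list: list[str]) -> list[list[float]]: ...

# Hypothetical backend: any async callable mapping texts to vectors.
EmbedFn = Callable[[list[str]], Awaitable[list[list[float]]]]

class NomicEmbed(EmbedderClient):
    def __init__(self, embed_fn: EmbedFn, instruction: str = "search_document: "):
        self.embed_fn = embed_fn
        self.instruction = instruction  # default instruction, overridable per instance

    async def create(self, input_data: str) -> list[float]:
        # Reuse the batch path so the instruction is applied in one place.
        return (await self.create_batch([input_data]))[0]

    async def create_batch(self, input_data_list: list[str]) -> list[list[float]]:
        prefixed = [self.instruction + text for text in input_data_list]
        return await self.embed_fn(prefixed)

# Fake backend for demonstration only: one-dimensional "embedding" = text length.
async def fake_embed(texts: list[str]) -> list[list[float]]:
    return [[float(len(t))] for t in texts]

if __name__ == "__main__":
    embedder = NomicEmbed(fake_embed)
    print(asyncio.run(embedder.create("hello")))
```

Swapping the instruction at construction time (for example "search_query: ") keeps the create and create_batch calls unchanged.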
Also, if you create a new working embedder client for one of these models, we would be happy to merge your PR so others in the community can use it as well!
Hi, thanks for your answer.
I looked quickly at the code, but I don't think it will be a quick and easy implementation. Those embedders generally require different prompts for document embedding vs. retrieval. In fact, multilingual-e5-large-instruct uses a lot of different prompts, depending on the type of source: https://github.com/microsoft/unilm/blob/9c0f1ff7ca53431fe47d2637dfe253643d94185b/e5/utils.py#L106
I'm not sure how to implement this, but it shouldn't be infeasible, I guess.
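To make the document vs. retrieval split concrete, here is a rough sketch of the prompt formatting alone, separate from any embedder class. It assumes the "Instruct: ...\nQuery: ..." template from the multilingual-e5-large-instruct model card, where only queries carry an instruction and documents are embedded as-is. The E5_TASKS table and its "web_search" key are hypothetical names for illustration; the full set of task descriptions is in the linked utils.py.

```python
# Hypothetical task table: key names are illustrative, and the description
# is the web-search example from the model card.
E5_TASKS = {
    "web_search": "Given a web search query, retrieve relevant passages that answer the query",
}

def format_query(task: str, query: str, tasks: dict[str, str] = E5_TASKS) -> str:
    """Wrap a retrieval query with the one-sentence instruction for its task."""
    return f"Instruct: {tasks[task]}\nQuery: {query}"

def format_document(text: str) -> str:
    """Documents get no instruction on the index side."""
    return text

if __name__ == "__main__":
    print(format_query("web_search", "how do embeddings work"))
    print(format_document("Embeddings map text to vectors."))
```

The open question for an embedder client is then only which of the two formatters a given call should use, since create and create_batch do not say whether the input is a document or a query.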
Regards.
@berengerdoneux Is this still relevant? Please confirm within 14 days or this issue will be closed.
@berengerdoneux Is this still an issue? Please confirm within 14 days or this issue will be closed.