How to add the "query similarity score" to objects in the response tree
Is it possible to add the similarity score of the object attribute in the Tree response?
Thank you
If your agent uses a "hybrid" or "vector" query to retrieve data, this is possible. If no one attempts this within a few days, I can handle both the backend and the frontend sides.
In elysia/tools/retrieval/util.py, lines 606-630 contain the query code. You can add return_metadata=MetadataQuery(distance=True) there to return distances, but you will also have to modify or refactor the DTOs/classes, the returned objects, and every other part of the codebase that consumes them.
if tool_args["search_type"] == "hybrid":
    response = await collection.query.hybrid(
        query=tool_args["search_query"],
        limit=tool_args["limit"],
        filters=combined_filter,
        return_references=reference,
        target_vector=(
            named_vector_fields[collection.name]
            if named_vector_fields
            else None
        ),
    )
elif tool_args["search_type"] == "vector":
    response = await collection.query.near_text(
        query=tool_args["search_query"],
        limit=tool_args["limit"],
        filters=combined_filter,
        return_references=reference,
        target_vector=(
            named_vector_fields[collection.name]
            if named_vector_fields
            else None
        ),
    )
In Weaviate, setting distance=True in the return_metadata parameter (return_metadata=MetadataQuery(distance=True)) of a query instructs the client to include the vector distance between the query and each returned object in the search results. This distance is a measure of similarity: a lower distance means higher similarity between the query and the object.
Docs: https://docs.weaviate.io/academy/py/starter_text_data/text_searches/semantic https://docs.weaviate.io/weaviate/search/similarity
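Since a distance is lower-is-better while a "similarity score" is usually read as higher-is-better, it may help to normalise the value before it leaves the retrieval code. Here is a minimal sketch of that step. It assumes cosine distance (where distance = 1 - cosine similarity), and it assumes that a hybrid query may report a fused `score` instead of a distance; both assumptions should be checked against the Weaviate docs for your setup. `FakeMetadata` and `similarity_from_metadata` are hypothetical names used only for illustration, not part of Elysia or the Weaviate client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeMetadata:
    """Hypothetical stand-in for the metadata attached to a returned object."""

    distance: Optional[float] = None
    score: Optional[float] = None


def similarity_from_metadata(metadata) -> Optional[float]:
    """Return a higher-is-better value, or None if no metadata was returned."""
    # Assumption: hybrid queries expose a fused relevance score.
    score = getattr(metadata, "score", None)
    if score is not None:
        return float(score)

    # Assumption: vector queries use cosine distance,
    # so similarity = 1 - distance.
    distance = getattr(metadata, "distance", None)
    if distance is not None:
        return 1.0 - float(distance)

    # Nothing was requested or returned (e.g. a plain fetch).
    return None
```

With a different distance metric (dot product, L2, and so on) the conversion would need to change, so it may be simpler to pass the raw distance through and document its direction instead.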
Yep, @gulyabaniTR is spot on! We need to edit the Weaviate query code itself to achieve this, and then add a field to the returned objects themselves. It'd need to be added alongside the actual object fields in the returned Result/Retrieval object, and the frontend would need to be aware that this similarity score field is not a field in the data, but a piece of metadata. We should probably have a standard string indicator, e.g. anything named "ELYSIA_*" gets auto-ignored by the frontend, but let's leave that for another time.
@gulyabaniTR I can assign this to you if you would like? I'm also happy to do it. I don't see any harm in adding this to the retrieved objects if possible. We'd just need to make sure it's not explicitly relied on anywhere else in the tree, otherwise it will conflict with cases where there are no scores (e.g. fetch_objects).
Hi Danny, thanks for your comment and guidance. This feature/improvement definitely affects a lot of areas in the codebase, so we need to proceed carefully, as you mentioned.
And yes, if it's not urgent, I'd love to look into this. I'll be waiting for you to provide the necessary guidance.
Amazing! So just like you said, the main change would be to add the return metadata to hybrid and near_text in: https://github.com/weaviate/elysia/blob/c5b6451daf2477ad14032b0ecb5b2c3a67d63bd3/elysia/tools/retrieval/util.py#L606-L630
Then augment the objects with some kind of new property, similar to how uuid is added:
https://github.com/weaviate/elysia/blob/c5b6451daf2477ad14032b0ecb5b2c3a67d63bd3/elysia/tools/retrieval/query.py#L570-L573
E.g. an ELYSIA_SIMILARITY_SCORE field, or similar.
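As a rough illustration of that augmentation step, the sketch below attaches the score next to the uuid and simply leaves the key out when no score exists, which is one way to avoid conflicts with score-less retrievals such as fetch_objects. `build_object` and `SIMILARITY_KEY` are hypothetical names; the actual code in query.py may be structured differently.

```python
from typing import Any, Optional

# Field name proposed in this thread; treated as metadata, not data.
SIMILARITY_KEY = "ELYSIA_SIMILARITY_SCORE"


def build_object(
    properties: dict, uuid: Any, similarity: Optional[float] = None
) -> dict:
    """Copy an object's properties and attach the uuid and optional score."""
    obj = dict(properties)  # copy so the caller's dict is not mutated
    obj["uuid"] = str(uuid)
    if similarity is not None:
        # Only present when the query actually produced a score.
        obj[SIMILARITY_KEY] = similarity
    return obj
```

Omitting the key rather than storing None keeps the existing output for fetch_objects unchanged, at the cost of consumers having to check for the key's presence.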
Then I think the Retrieval object will need to be initialised by default so that the ELYSIA_SIMILARITY_SCORE field isn't overwritten or dropped when mapping the base object fields to those recognised by the frontend, so unmapped_keys should include this field name:
https://github.com/weaviate/elysia/blob/c5b6451daf2477ad14032b0ecb5b2c3a67d63bd3/elysia/objects.py#L824-L843
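To make the idea concrete, here is a simplified sketch of a mapping step that carries unmapped keys through untouched. It is not the actual implementation in objects.py: `map_object`, the shape of `mapping` (frontend field name to source field name), and the default tuple are all assumptions made for illustration.

```python
from typing import Iterable, Mapping

# Assumed default: keys that bypass the field mapping entirely.
DEFAULT_UNMAPPED_KEYS = ("uuid", "ELYSIA_SIMILARITY_SCORE")


def map_object(
    obj: Mapping,
    mapping: Mapping[str, str],
    unmapped_keys: Iterable[str] = DEFAULT_UNMAPPED_KEYS,
) -> dict:
    """Rename fields for the frontend while passing metadata keys through."""
    # mapping: frontend field name -> field name in the source object
    mapped = {target: obj.get(source) for target, source in mapping.items()}
    for key in unmapped_keys:
        if key in obj:
            # Copied last, so a mapped field cannot overwrite it.
            mapped[key] = obj[key]
    return mapped
```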
This should be enough, I believe, to expose the similarity score to the LLM. Then all we'd need is to tell the LLM about it when it summarises retrieved information, by adding an explanation of how to use this field (just explaining that this is a similarity score, higher=better) to the prompt, probably in the cited summariser:
https://github.com/weaviate/elysia/blob/c5b6451daf2477ad14032b0ecb5b2c3a67d63bd3/elysia/tools/text/prompt_templates.py#L6-L50
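The prompt addition could be as small as a short note appended to the existing instructions. The wording and the `with_similarity_note` helper below are only a suggestion, not text from the Elysia prompt templates, and the "higher is better" statement assumes the stored value really is a similarity rather than a raw distance.

```python
# Suggested wording only; adjust to match how the score is actually stored.
SIMILARITY_NOTE = (
    "Each retrieved object may include an ELYSIA_SIMILARITY_SCORE field. "
    "It is metadata about the search, not part of the data itself. "
    "Higher values mean the object matched the query more closely. "
    "The field is absent when the retrieval did not produce scores."
)


def with_similarity_note(prompt: str) -> str:
    """Append the similarity explanation to an existing prompt string."""
    return prompt.rstrip() + "\n\n" + SIMILARITY_NOTE
```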
Separately, the frontend should then be able to use the ELYSIA_SIMILARITY_SCORE field to display the scores when objects are retrieved.
I'd recommend opening a feature request at https://github.com/weaviate/elysia-frontend when your PR here is done, linking to it so that it can be implemented in due time. Or, making a separate PR over there manually, if you feel up to it.
I think this should be all the changes required, but of course it will be worth testing :) If you have some experimental setups in Python, make sure to use e.g. tree.complex_lm.inspect_history(5) or tree.base_lm.inspect_history(5) to see the raw LLM calls and check they include the new fields, to ensure they've been passed down to the tree correctly. Good luck and let me know if you need more advice!
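If checking the history by eye gets tedious, a small helper can capture what inspect_history prints and search it for the new field. This sketch assumes inspect_history writes to standard output and accepts the number of calls as its first argument; `history_mentions` is a hypothetical helper, not part of Elysia.

```python
import contextlib
import io
from typing import Callable


def history_mentions(
    inspect_history: Callable[[int], object], needle: str, n: int = 5
) -> bool:
    """Run inspect_history(n) and report whether its output contains needle."""
    buffer = io.StringIO()
    # Assumption: the history is printed to stdout rather than returned.
    with contextlib.redirect_stdout(buffer):
        inspect_history(n)
    return needle in buffer.getvalue()
```

For example, `history_mentions(tree.base_lm.inspect_history, "ELYSIA_SIMILARITY_SCORE")` should return True after a hybrid or vector retrieval if the field reached the LLM.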
Any news? Bumping this.