core [Feature] Enabling multiple recall queries

Is your feature request related to a problem? Please describe. Whenever a user asks multiple questions in the same message (e.g. "How is the element X working? Who should I email to get access on it?") or a question that can be split into N sub-questions, as a plugin developer I may want a primitive to perform multiple recalls in parallel instead of a single recall.

It may be useful to let the hook cat_recall_query handle a list of strings instead of a single string.

Describe the solution you'd like Modify the return type of the hook cat_recall_query from string to list of strings, so that a plugin developer can decide whether the recall query has to be a single string (as it is now) or multiple strings.

With little effort this change can remain compatible with plugins whose cat_recall_query returns just a string instead of a list.

Mar 03 '25 16:03 AlessandroSpallina

Hi @AlessandroSpallina, thanks for the suggestion.

Doing it directly will break plugins using cat_recall_query, or force us to handle the hook in a specific way in the MadHatter (which is a pain in the a**)

If you like it, create a new hook just after cat_recall_query, called cat_recall_queries, having as a default value [recall_query]

recall_queries = self.mad_hatter.execute_hook(
    "cat_recall_queries", [recall_query], cat=self
)

Not sure how to handle the code afterwards: the loop should cover the embeddings and the DB query, but not the recall configs. Also, should all the results be concatenated in working memory? What do you think?

Mar 04 '25 13:03 pieroit

My first idea was to handle both return types of cat_recall_query not in MadHatter but directly in StrayCat, something like (diff of stray_cat.py L296):

- recall_query = self.mad_hatter.execute_hook(
+ recall_queries = self.mad_hatter.execute_hook(
    "cat_recall_query", recall_query, cat=self
)
+ if isinstance(recall_queries, str):
+     recall_queries = [recall_queries]
- log.info(f"Recall query: '{recall_query}'")
+ log.info(f"Recall queries: {';'.join(recall_queries)}")

and then loop for each recall query.

In this way we support str | list[str] as the return value of the hook. The drawback I see is that the hook name cat_recall_query suggests a single recall query, while in reality there can be several. Do you still see issues with this approach?

The results should be concatenated in working memory, but I'm not sure about the K: if a query is split into N recall queries, how many chunks per query should stay in working memory? K (so K*N chunks in total) or K/N?

Mar 04 '25 15:03 AlessandroSpallina

Maybe you need a decomposition of the query: recall QxN chunks for each memory and append them to each list without duplicates.

Mar 05 '25 10:03 nickprock

Indeed, this proposal enables query decomposition. Do you agree on the hook modification?

So, according to @nickprock

The results should be concatenated in working memory, but I'm not sure about the K: if a query is split into N recall queries, how many chunks per query should stay in working memory? K (so K*N chunks in total) or K/N?

K*N minus the duplicated chunks
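
A minimal sketch of that merge rule, assuming each recalled chunk carries a stable id. The names merge_recalls, fake_recall and FAKE_INDEX are hypothetical and only illustrate the idea; they are not part of the Cat's API.

```python
from typing import Callable, Dict, List, Tuple

Chunk = Tuple[str, str]  # (chunk id, chunk text); the id is an assumption

def merge_recalls(
    queries: List[str],
    recall: Callable[[str, int], List[Chunk]],
    k: int,
) -> List[Chunk]:
    """Recall up to k chunks per query and merge them, dropping duplicates.

    The result holds at most k * len(queries) chunks, in first-seen order.
    """
    seen = set()
    merged: List[Chunk] = []
    for query in queries:
        for chunk_id, text in recall(query, k):
            if chunk_id not in seen:
                seen.add(chunk_id)
                merged.append((chunk_id, text))
    return merged

# Hypothetical stand-in for embedding the query and searching the vector DB.
FAKE_INDEX: Dict[str, List[Chunk]] = {
    "How is the element X working?": [
        ("doc-1", "X overview"),
        ("doc-2", "X internals"),
    ],
    "Who should I email to get access on it?": [
        ("doc-2", "X internals"),
        ("doc-3", "Access contacts"),
    ],
}

def fake_recall(query: str, k: int) -> List[Chunk]:
    return FAKE_INDEX.get(query, [])[:k]

merged = merge_recalls(list(FAKE_INDEX), fake_recall, k=2)
```

Here N = 2 and K = 2, so the upper bound is 4 chunks; one chunk is shared between the two queries, so three end up in the merged list.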

Mar 05 '25 12:03 AlessandroSpallina

Indeed, this proposal enables query decomposition. Do you agree on the hook modification?

Your solution does not work, because hooks are executed in a pipe and there is no way to check whether the value is a string or a list in the middle of the pipe; you can only check at the end.
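
To illustrate the problem, here is a toy model of a hook pipe. It is only a sketch of the idea, not MadHatter's actual code, and execute_hook_pipe, plugin_a and plugin_b are hypothetical names.

```python
from typing import Callable, List

def execute_hook_pipe(hooks: List[Callable], value):
    """Toy pipe: each hook receives the output of the previous one."""
    for hook in hooks:
        value = hook(value)
    return value

def plugin_a(recall_query: str) -> List[str]:
    # New-style plugin: splits the query into sub-questions.
    parts = [p.strip() for p in recall_query.split("?")]
    return [p + "?" for p in parts if p]

def plugin_b(recall_query: str) -> str:
    # Existing plugin: assumes it always receives a string.
    return recall_query.lower()

try:
    execute_hook_pipe(
        [plugin_a, plugin_b], "How does X work? Who grants access?"
    )
    failed = False
except AttributeError:
    # plugin_b received a list from plugin_a and called .lower() on it.
    failed = True
```

Normalizing str to list[str] after the pipe returns comes too late for plugin_b, which has already received the list.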

So, according to @nickprock

The results should be concatenated in working memory, but I'm not sure about the K: if a query is split into N recall queries, how many chunks per query should stay in working memory? K (so K*N chunks in total) or K/N?

K*N minus the duplicated chunks

This is getting unnecessarily complicated. I would say hook after_cat_recalled_memories and do your own logic there: split the query, do more recalls, and concatenate the results to the original recall. I guess in any case you want to do a query with the full user question(s).

If there are more requests for this, we'll think it through further.

Mar 06 '25 11:03 pieroit

With v2 you can pass any number of resources in the HTTP request (URIs corresponding to files or anything else) or add them via plugin, and decide what to do with them, always via plugin, with a custom agent or by hooking prompt construction.

There is no recall function in core now; it will remain in the vector memory plugin for legacy. Since custom agents are classes and you can have as many as you want, you will be able to define any kind of recall strategy (including no recall, recall before LLM execution, recall at every agent step, recall as a tool).

Closing this, thank you

Nov 14 '25 19:11 pieroit
