Skip context search if systemMessage doesn't have {context}
Thank you for your feedback!
I am programmatically editing the system message, stripping {context} if it's not required.
This way, I can avoid any unnecessary vectorization (and the related search steps) in situations where {context} doesn't actually contribute to the message (the main goal of this PR).
After rereading searchDocumentAndCreateSystemMessage, I believe its primary purpose is to manage the steps for substituting the context. However, my proposal is to streamline that process by skipping unnecessary substitutions when the context isn't needed.
I can suggest an alternative approach if you're open to that. Here's what I have in mind:
/**
 * @param array<string, string|int>|array<mixed[]> $additionalArguments
 */
public function answerQuestion(string $question, int $k = 4, array $additionalArguments = []): string
{
    $contextIsExpected = str_contains($this->systemMessageTemplate, '{context}');
    $systemMessage = $contextIsExpected
        ? $this->searchDocumentAndCreateSystemMessage($question, $k, $additionalArguments)
        : $this->getSystemMessage('');
    $this->chat->setSystemMessage($systemMessage);

    return $this->chat->generateText($question);
}
So your idea is to have a way to skip RAG processing altogether, isn't it?
Hmm, maybe it would be better to put the str_contains($this->systemMessageTemplate, '{context}') check inside searchDocumentAndCreateSystemMessage, so that the behaviour is consistent everywhere we call the method (answerQuestion, answerQuestionFromChat and answerQuestionStream).
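A minimal sketch of that idea, assuming a simplified stand-in class rather than the library's real one: QuestionAnsweringSketch, its $retrieveContext closure and the $searchCalls counter are hypothetical names used only for illustration, and the real method also takes $additionalArguments. When the template has no {context} placeholder, the method returns early and the embedding and vector search never run.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the question-answering class; not the library's real code.
final class QuestionAnsweringSketch
{
    /** Counts retrieval calls so the short-circuit is observable. */
    public int $searchCalls = 0;

    public function __construct(
        private string $systemMessageTemplate,
        // Assumed stand-in for "embed the question + similarity search".
        private \Closure $retrieveContext,
    ) {
    }

    public function searchDocumentAndCreateSystemMessage(string $question, int $k): string
    {
        // No {context} placeholder: skip the vector search entirely.
        if (!str_contains($this->systemMessageTemplate, '{context}')) {
            return $this->systemMessageTemplate;
        }

        $this->searchCalls++;
        $context = ($this->retrieveContext)($question, $k);

        return str_replace('{context}', $context, $this->systemMessageTemplate);
    }
}

// Usage: the same retriever is passed to both instances, but only one template needs it.
$retriever = static fn (string $question, int $k): string => "doc about {$question}";

$withContext = new QuestionAnsweringSketch('Answer using: {context}', $retriever);
$withoutContext = new QuestionAnsweringSketch('You are a helpful assistant.', $retriever);

echo $withContext->searchDocumentAndCreateSystemMessage('PHP', 4), PHP_EOL;
echo $withoutContext->searchDocumentAndCreateSystemMessage('PHP', 4), PHP_EOL;
```

Because the check lives inside the method, every caller gets the same behaviour without repeating the str_contains test.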
I would like to hear @MaximeThoonsen's opinion on this.
Hey @RobinDev. Thanks for this contribution. I'm not sure I understand why you wouldn't call generateChat directly in this case if you don't need the search part?
> So your idea is to have a way to skip RAG processing altogether, isn't it? Hmm, maybe it would be better to put the str_contains($this->systemMessageTemplate, '{context}') check inside searchDocumentAndCreateSystemMessage, so that the behaviour is consistent everywhere we call the method (answerQuestion, answerQuestionFromChat and answerQuestionStream). I would like to hear @MaximeThoonsen's opinion on this.

Yes, I can do that (I do not use stream, so I took the quickest path for me).
> Hey @RobinDev. Thanks for this contribution. I'm not sure I understand why you wouldn't call generateChat directly in this case if you don't need the search part?

I use the same user interface to query different collections from the same stack (same vectorized content, same db).
Sometimes, users select no collections at all, and in those cases, I want to avoid vectorizing the user question for performance.
Can you confirm that this is OK before I open a better PR that also covers the stream variant and the other callers?
Thanks.
@RobinDev I see, let's go for that
I have updated the PR.