ml-commons
Support for RAG with Q&A LLM Model via Connector. Currently we can do RAG only with text generation models
I am using a Q&A model for RAG. The Q&A model takes two inputs, a question and a context:
{
"question": "where live??",
"context": "My name is Clara and I live in Madrid."
}
While doing a neural search, how do I pass the parameters for question and context? Where should I add the context? Or is using two parameters not possible in OpenSearch RAG?
{
  "query": {
    "neural": {
      "neural_field_vector": {
        "query_text": {{question}},
        "model_id": "90tQBI0BCjm4TK7_mz2B",
        "k": 1
      }
    }
  },
  "size": 1,
  "_source": [ "neural_field" ],
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "bedrock/q&a model",
      "llm_question": {{question}},
      "conversation_id": "eHFSxIwBu3N8jyGnExM0",
      "context_size": 5,
      "interaction_size": 2,
      "timeout": 15
    }
  }
}
Question and answer are not part of my document/index records. Only neural_field is part of my document, and it is vectorized into neural_field_vector. When I do a neural search with a question, it returns the matching neural_field, and that part works perfectly. Now the same question should also go to my LLM as the question, and the neural_field returned by the neural search should be mapped to the context. In LLM RAG I do not have a way to do that mapping; all I have is context_field_list, which refers to document fields, not LLM parameters.
We need to support this; as models advance, they can take multiple parameters.
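For illustration, a connector for such a Q&A model might look roughly like the sketch below; the Hugging Face endpoint, the model name (deepset/roberta-base-squad2), and the credential field are assumptions for the example, not my actual setup:

POST /_plugins/_ml/connectors/_create
{
  "name": "Q&A model connector (illustrative)",
  "description": "Connector for an extractive Q&A model that takes a question and a context",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "endpoint": "api-inference.huggingface.co"
  },
  "credential": {
    "hf_api_token": "<your API token>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/models/deepset/roberta-base-squad2",
      "headers": {
        "Authorization": "Bearer ${credential.hf_api_token}"
      },
      "request_body": "{ \"inputs\": { \"question\": \"${parameters.question}\", \"context\": \"${parameters.context}\" } }"
    }
  ]
}

The question and context then have to be supplied as separate parameters at predict time, which is exactly the mapping I am missing in the RAG search path.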
@mingshl could we solve this with a generic ml search processors?
For the ml inference search processors, yes, they can use the Q&A model to achieve Q&A search. The generative_qa_parameters can be put in the model_config parameter of the inference processor.
For example:
{
  "description": "test ml model search request processor",
  "response_processors": [
    {
      "ml_inference": {
        "model_id": "90tQBI0BCjm4TK7_mz2B",
        "model_config": {
          "generative_qa_parameters": {
            "conversation_id": "eHFSxIwBu3N8jyGnExM0",
            "context_size": 5,
            "interaction_size": 2,
            "timeout": 15
          }
        },
        "input_map": [
          {
            "query_text": "llm_question",
            "neural_query_result_field": "context"
          }
        ],
        "output_map": [
          {
            "llm_response": "answer"
          }
        ]
      }
    }
  ]
}
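As a usage sketch, a processor configuration like this would be registered as a search pipeline first; the pipeline name rag_qa_pipeline below is only a placeholder, and the model_config with generative_qa_parameters from the example above would go in the same way:

PUT /_search/pipeline/rag_qa_pipeline
{
  "response_processors": [
    {
      "ml_inference": {
        "model_id": "90tQBI0BCjm4TK7_mz2B",
        "input_map": [
          {
            "query_text": "llm_question",
            "neural_query_result_field": "context"
          }
        ],
        "output_map": [
          {
            "llm_response": "answer"
          }
        ]
      }
    }
  ]
}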
Users can pipe up a search request processor, either the ml_inference search request processor or the text_embedding request processor; both should work. The user should then be able to use a standard neural query, and the context will be filled in from the neural search results via the input_map of the response processor:
{
  "query": {
    "neural": {
      "neural_field_vector": {
        "query_text": {{question}},
        "model_id": "90tQBI0BCjm4TK7_mz2B", // <text_embedding_model_id>
        "k": 1
      }
    }
  }
}
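To close the loop, a sketch of running that neural query through the pipeline; my-index and rag_qa_pipeline are placeholder names, and the query_text just reuses the question from the earlier example:

GET /my-index/_search?search_pipeline=rag_qa_pipeline
{
  "query": {
    "neural": {
      "neural_field_vector": {
        "query_text": "where live??",
        "model_id": "<text_embedding_model_id>",
        "k": 1
      }
    }
  }
}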