ml-commons
HTTP connector with Hugging Face: enabling RAG pipeline
I am trying to implement a RAG pipeline using a pretrained Hugging Face model, but I am having trouble building a custom blueprint for it. I adopted this connector, with modifications, from a previous issue #1468:
{
  "name": "sentence-transformers/all-MiniLM-L6-v2",
  "description": "The connector to Hugging Face GPT Language Model",
  "version": "1.0.1",
  "protocol": "http",
  "parameters": {
    "endpoint": "api-inference.huggingface.co",
    "model": "gpt2",
    "temperature": 1.0
  },
  "credential": {
    "HF_key": "my_API_key"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api-inference.huggingface.co/models/sentence-transformers/all-MiniLM-L6-v2",
      "headers": {
        "Authorization": "Bearer ${credential.HF_key}"
      },
      "request_body": """{
        "model": "${parameters.model}",
        "messages": ${parameters.messages},
        "temperature": ${parameters.temperature},
        "inputs": {"source_sentence": "source", "sentences": ["sentence"]}
      }"""
    }
  ]
}
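As a side note, the Hugging Face Inference API's sentence-similarity task expects only an "inputs" object, which differs from the chat-style fields (model, messages, temperature) in the blueprint above. A minimal sketch of building that payload shape (the helper name is mine, not from any library; verify the exact schema against the HF Inference API docs):

```python
# Sketch: the sentence-similarity task on the HF Inference API takes an
# "inputs" object with a source sentence and candidate sentences; chat-style
# fields such as "messages" are not part of this task's request schema.
def build_similarity_payload(source_sentence, sentences):
    """Build the request body the sentence-similarity task expects."""
    return {
        "inputs": {
            "source_sentence": source_sentence,
            "sentences": list(sentences),
        }
    }

payload = build_similarity_payload(
    "That is a happy person",
    ["That is a happy dog", "Today is a sunny day"],
)
```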
I have set the cluster settings, registered and deployed the model using this connector, and created the RAG pipeline. The index is configured to use the RAG pipeline by default like this:
index_body = {
    "settings": {
        "index.number_of_shards": 4,
        "index.search.default_pipeline": "rag_pipeline"
    },
    "mappings": {
        "properties": {
            "text": {
                "type": "text"
            }
        }
    }
}
response = self.client.indices.create(index=name, body=index_body)
I then run a search like this:
query = {
    "query": {
        "match": {
            "text": q
        }
    },
    "ext": {
        "generative_qa_parameters": {
            "llm_question": q,
            "llm_model": self.llm_model,
            "context_size": 10,
            "timeout": 60
        }
    }
}
response = self.client.search(
    body=query,
    index=index_name
)
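When the pipeline does work, the generated answer is returned in the response's ext section rather than in hits. A small sketch of pulling it out (the field path follows the OpenSearch conversational search docs; treat it as an assumption for your version):

```python
# Sketch: extract the RAG processor's generated answer from a search
# response. The "ext.retrieval_augmented_generation.answer" path is taken
# from the OpenSearch conversational search documentation.
def extract_rag_answer(response):
    """Return the generated answer from a RAG-enabled search response, or None."""
    ext = response.get("ext", {})
    return ext.get("retrieval_augmented_generation", {}).get("answer")

sample = {"ext": {"retrieval_augmented_generation": {"answer": "top 10 datasets..."}}}
answer = extract_rag_answer(sample)
```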
And I get the following error:
opensearchpy.exceptions.TransportError: TransportError(500, 'null_pointer_exception', 'Cannot invoke "java.util.Map.get(Object)" because "error" is null')
I think this is an error in the connector, specifically in how it calls Hugging Face's Inference API endpoints. How do I configure the connector to interact with the Hugging Face API correctly? According to the HF documentation, the request body contains all the parameters needed to make a request. What should I fix?
I don't quite follow: the connector you shared is for a text embedding model. Are you using it as the LLM in the RAG pipeline? Can you share the RAG pipeline configuration?
Yes, I am using it as an LLM in the RAG pipeline. I put the following request body into the /_search/pipeline/rag_pipeline endpoint:
data = {
    "response_processors": [
        {
            "retrieval_augmented_generation": {
                "tag": "rag_pipeline",
                "description": "HuggingFace Connector pipeline",
                "model_id": MODEL_ID,
                "context_field_list": ["text"],
                "system_prompt": "You are a dataset search tool",
                "user_instructions": "For a given search query, list the top 10 most relevant datasets as answers."
            }
        }
    ]
}
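For completeness, one way to send that body with opensearch-py is a raw transport request, since there is no dedicated search-pipeline helper in older client versions. A sketch (the helper function is mine; `client` is an already constructed OpenSearch client):

```python
# Sketch: build the request for creating a search pipeline. The actual call
# would be client.transport.perform_request(method, path, body=body).
def rag_pipeline_request(pipeline_name, processors):
    """Return the (method, path, body) triple for creating a search pipeline."""
    path = f"/_search/pipeline/{pipeline_name}"
    body = {"response_processors": processors}
    return "PUT", path, body

method, path, body = rag_pipeline_request(
    "rag_pipeline",
    [{"retrieval_augmented_generation": {
        "tag": "rag_pipeline",
        "model_id": "MODEL_ID",  # placeholder for the registered model ID
        "context_field_list": ["text"],
    }}],
)
# client.transport.perform_request(method, path, body=body)
```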
The connector I shared is for the huggingface/sentence-transformers/all-MiniLM-L6-v2 model, which is one of the pretrained models listed by OpenSearch:
supported pretrained models
Is this model not suitable for the RAG pipeline?
@MiaNCSU, huggingface/sentence-transformers/all-MiniLM-L6-v2 is not an LLM for the RAG pipeline; it's a model for generating text embeddings. You can try an LLM like OpenAI GPT, Anthropic Claude 3, etc.
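For example, a chat-completions connector blueprint in the style of the ml-commons connector blueprint examples looks roughly like this (a sketch; verify field names against the blueprint docs for your OpenSearch version, and OPENAI_KEY is a placeholder):

```python
# Sketch of an OpenAI chat-completions connector blueprint, adapted from the
# ml-commons connector blueprint examples. A connector like this exposes a
# chat LLM that the RAG response processor can call.
openai_connector = {
    "name": "OpenAI chat connector",
    "description": "Connector to an LLM usable by the RAG pipeline",
    "version": "1",
    "protocol": "http",
    "parameters": {
        "endpoint": "api.openai.com",
        "model": "gpt-3.5-turbo",
    },
    "credential": {
        "openAI_key": "OPENAI_KEY",  # placeholder, not a real key
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://api.openai.com/v1/chat/completions",
            "headers": {"Authorization": "Bearer ${credential.openAI_key}"},
            "request_body": '{ "model": "${parameters.model}", "messages": ${parameters.messages} }',
        }
    ],
}
```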
@MiaNCSU, I will close this issue since there has been no reply after 2 weeks. Feel free to reply and reopen it if you still have questions.
Hello,
Are OpenAI GPT and Anthropic Claude 3 the only possible options? Is the OpenSearch RAG pipeline not compatible with any open-source LLMs?