ml-commons
HTTP connector with Hugging Face: enabling RAG pipeline
I am trying to implement a RAG pipeline using a pretrained Hugging Face model, but I am having trouble building a custom blueprint for it. I adopted this connector, with modifications, from a previous issue #1468:
{
  "name": "sentence-transformers/all-MiniLM-L6-v2",
  "description": "The connector to Hugging Face GPT Language Model",
  "version": "1.0.1",
  "protocol": "http",
  "parameters": {
    "endpoint": "api-inference.huggingface.co",
    "model": "gpt2",
    "temperature": 1.0
  },
  "credential": {
    "HF_key": "my_API_key"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api-inference.huggingface.co/models/sentence-transformers/all-MiniLM-L6-v2",
      "headers": {
        "Authorization": "Bearer ${credential.HF_key}"
      },
      "request_body": """{
        "model": "${parameters.model}",
        "messages": ${parameters.messages},
        "temperature": ${parameters.temperature},
        "inputs": {"source_sentence": "source", "sentences": ["sentence"]}
      }"""
    }
  ]
}
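As a side note, the Hugging Face Inference API's sentence-similarity task expects only an "inputs" object, which differs from the chat-style fields (model, messages, temperature) in the blueprint above. A minimal sketch of building that payload shape (the helper name is mine, not from any library; verify the exact schema against the HF Inference API docs):

```python
# Sketch: the sentence-similarity task on the HF Inference API takes an
# "inputs" object with a source sentence and candidate sentences; chat-style
# fields such as "messages" are not part of this task's request schema.
def build_similarity_payload(source_sentence, sentences):
    """Build the request body the sentence-similarity task expects."""
    return {
        "inputs": {
            "source_sentence": source_sentence,
            "sentences": list(sentences),
        }
    }

payload = build_similarity_payload(
    "That is a happy person",
    ["That is a happy dog", "Today is a sunny day"],
)
```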
I have set the cluster settings, registered and deployed the model using this connector, and created the RAG pipeline. The index is configured to use the RAG pipeline by default like this:
index_body = {
    "settings": {
        "index.number_of_shards": 4,
        "index.search.default_pipeline": "rag_pipeline"
    },
    "mappings": {
        "properties": {
            "text": {
                "type": "text"
            }
        }
    }
}
response = self.client.indices.create(index=name, body=index_body)
I then run a search like this:
query = {
    "query": {
        "match": {
            "text": q
        }
    },
    "ext": {
        "generative_qa_parameters": {
            "llm_question": q,
            "llm_model": self.llm_model,
            "context_size": 10,
            "timeout": 60
        }
    }
}
response = self.client.search(
    body=query,
    index=index_name
)
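When the pipeline does work, the generated answer is returned in the response's ext section rather than in hits. A small sketch of pulling it out (the field path follows the OpenSearch conversational search docs; treat it as an assumption for your version):

```python
# Sketch: extract the RAG processor's generated answer from a search
# response. The "ext.retrieval_augmented_generation.answer" path is taken
# from the OpenSearch conversational search documentation.
def extract_rag_answer(response):
    """Return the generated answer from a RAG-enabled search response, or None."""
    ext = response.get("ext", {})
    return ext.get("retrieval_augmented_generation", {}).get("answer")

sample = {"ext": {"retrieval_augmented_generation": {"answer": "top 10 datasets..."}}}
answer = extract_rag_answer(sample)
```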
And I get the following error:
opensearchpy.exceptions.TransportError: TransportError(500, 'null_pointer_exception', 'Cannot invoke "java.util.Map.get(Object)" because "error" is null')
I think this is an error in the connector, specifically in how it calls Hugging Face's Inference API endpoints. How do I configure the connector to interact with the Hugging Face API correctly? According to the HF documentation, the request body contains all the parameters needed to make a request. What should I fix?
I don't quite follow: the connector you shared is for a text embedding model. Are you using it as the LLM in the RAG pipeline? Can you share the RAG pipeline configuration?
Yes, I am using it as an LLM in the RAG pipeline. I put the following request body into the /_search/pipeline/rag_pipeline endpoint:
data = {
    "response_processors": [
        {
            "retrieval_augmented_generation": {
                "tag": "rag_pipeline",
                "description": "HuggingFace Connector pipeline",
                "model_id": MODEL_ID,
                "context_field_list": ["text"],
                "system_prompt": "You are a dataset search tool",
                "user_instructions": "For a given search query, list the top 10 most relevant datasets as answers."
            }
        }
    ]
}
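For completeness, one way to send that body with opensearch-py is a raw transport request, since there is no dedicated search-pipeline helper in older client versions. A sketch (the helper function is mine; `client` is an already constructed OpenSearch client):

```python
# Sketch: build the request for creating a search pipeline. The actual call
# would be client.transport.perform_request(method, path, body=body).
def rag_pipeline_request(pipeline_name, processors):
    """Return the (method, path, body) triple for creating a search pipeline."""
    path = f"/_search/pipeline/{pipeline_name}"
    body = {"response_processors": processors}
    return "PUT", path, body

method, path, body = rag_pipeline_request(
    "rag_pipeline",
    [{"retrieval_augmented_generation": {
        "tag": "rag_pipeline",
        "model_id": "MODEL_ID",  # placeholder for the registered model ID
        "context_field_list": ["text"],
    }}],
)
# client.transport.perform_request(method, path, body=body)
```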
The connector I shared is for the huggingface/sentence-transformers/all-MiniLM-L6-v2 model, which is one of the pretrained models listed by OpenSearch:
supported pretrained models
Is this model not suitable for the RAG pipeline?
@MiaNCSU, huggingface/sentence-transformers/all-MiniLM-L6-v2 is not an LLM for the RAG pipeline; it's a model for generating text embeddings. You can try an LLM like OpenAI GPT, Anthropic Claude 3, etc.
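For example, a chat-completions connector blueprint in the style of the ml-commons connector blueprint examples looks roughly like this (a sketch; verify field names against the blueprint docs for your OpenSearch version, and OPENAI_KEY is a placeholder):

```python
# Sketch of an OpenAI chat-completions connector blueprint, adapted from the
# ml-commons connector blueprint examples. A connector like this exposes a
# chat LLM that the RAG response processor can call.
openai_connector = {
    "name": "OpenAI chat connector",
    "description": "Connector to an LLM usable by the RAG pipeline",
    "version": "1",
    "protocol": "http",
    "parameters": {
        "endpoint": "api.openai.com",
        "model": "gpt-3.5-turbo",
    },
    "credential": {
        "openAI_key": "OPENAI_KEY",  # placeholder, not a real key
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://api.openai.com/v1/chat/completions",
            "headers": {"Authorization": "Bearer ${credential.openAI_key}"},
            "request_body": '{ "model": "${parameters.model}", "messages": ${parameters.messages} }',
        }
    ],
}
```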
@MiaNCSU, I will close this issue since there has been no reply after 2 weeks. Feel free to reply and reopen it if you still have questions.
Hello,
Are OpenAI GPT and Anthropic Claude 3 the only possible options? Is the OpenSearch RAG pipeline not compatible with any open-source LLMs?