aws-genai-llm-chatbot

Add Cohere Rerank 3 Support

Open ystoneman opened this issue 1 year ago • 7 comments

Issue #, if available: #280

Description of changes: Currently, this solution only supports cross-encoder/ms-marco-MiniLM-L-12-v2. I want to add Cohere Rerank 3 as an option.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

ystoneman, Apr 21 '24 22:04

@bigadsoleiman @azaylamba -- Do you have any idea why these changes to add Cohere Rerank 3 support aren't working?

When I click on the Cross-Encoder Model dropdown menu on the /rag/cross-encoders page, I still only get the cross-encoder/ms-marco-MiniLM-L-12-v2 model.

I made sure to define the COHERE_API_KEY in Secrets Manager.

In the browser dev tools, I can see the array only has the one cross-encoder:

[{…}]
  0: {provider: 'sagemaker', name: 'cross-encoder/ms-marco-MiniLM-L-12-v2', default: true, __typename: 'CrossEncoderData'}
  length: 1
  [[Prototype]]: Array(0)

ystoneman, Apr 21 '24 22:04

Does your config.json include the Cohere model after you run npm run config? The dropdown menu loads its list from the config object. See get_cross_encoder_models() in lib/shared/layers/python-sdk/python/genai_core/cross_encoder.py
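To check this quickly, something like the following could be run against a copy of the generated config. This is only a minimal sketch: list_cross_encoder_models is a hypothetical helper, not the project's get_cross_encoder_models(), and the embedded JSON is an illustrative excerpt that assumes the model list lives under rag.crossEncoderModels.

```python
import json

# Illustrative excerpt of a config.json; only the keys used below are shown.
CONFIG_JSON = """
{
  "rag": {
    "crossEncoderModels": [
      {"provider": "sagemaker", "name": "cross-encoder/ms-marco-MiniLM-L-12-v2", "default": true},
      {"provider": "cohere", "name": "rerank-english-v3.0"}
    ]
  }
}
"""


def list_cross_encoder_models(config: dict) -> list:
    """Hypothetical helper: return the cross-encoder entries, or [] if none are configured."""
    return config.get("rag", {}).get("crossEncoderModels", [])


models = list_cross_encoder_models(json.loads(CONFIG_JSON))
for model in models:
    # Print one line per configured cross-encoder.
    print(model["provider"], model["name"], model.get("default", False))
```

If the Cohere entry is missing from this list, the dropdown would have nothing to show for it, regardless of what is stored in Secrets Manager.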

azaylamba, Apr 22 '24 12:04

Is the following cdk-nag issue a common issue anyone else has encountered while modifying this project @bigadsoleiman, @massi-ang, or @azaylamba?


- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit.
✓ built in 23.95s
/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:98
    pathArray.forEach((p) => {
              ^
Error: Suppression path "/cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource" did not match any resource. This can occur when a resource does not exist or if a suppression is applied before a resource is created.
    at /home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:115:15
    at Array.forEach (<anonymous>)
    at Function.addResourceSuppressionsByPath (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:98:15)
    at new AwsGenAILLMChatbotStack (/home/ubuntu/environment/aws-genai-llm-chatbot/lib/aws-genai-llm-chatbot-stack.ts:273:25)
    at Object.<anonymous> (/home/ubuntu/environment/aws-genai-llm-chatbot/bin/aws-genai-llm-chatbot.ts:13:1)
    at Module._compile (node:internal/modules/cjs/loader:1369:14)
    at Module.m._compile (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/ts-node/src/index.ts:1618:23)
    at Module._extensions..js (node:internal/modules/cjs/loader:1427:10)
    at Object.require.extensions.<computed> [as .ts] (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1206:32)

Subprocess exited with error 1

This happened after I re-ran npm run config and then npx cdk deploy. I had it use the same prefix as before because the stack already existed, so I provided the existing VPC ID. And I selected "no" for VPC endpoints since those were already created by the previous deployment.

Re-running everything did make the cohere rerank model show up in the config file, but now I'm having this new issue.

ystoneman, Apr 22 '24 21:04

@ystoneman Can you share the config.json file?

azaylamba, Apr 23 '24 16:04

Thanks for following up @azaylamba. Here's my config.json:

{
  "prefix": "cloud9",
  "vpc": {
    "vpcId": "vpc-0906dfbea13ffd463",
    "createVpcEndpoints": false
  },
  "privateWebsite": false,
  "certificate": "",
  "domain": "",
  "cfGeoRestrictEnable": false,
  "cfGeoRestrictList": [],
  "bedrock": {
    "enabled": true,
    "region": "us-east-1"
  },
  "llms": {
    "sagemaker": [],
    "huggingfaceApiSecretArn": ""
  },
  "rag": {
    "enabled": true,
    "engines": {
      "aurora": {
        "enabled": false
      },
      "opensearch": {
        "enabled": true
      },
      "kendra": {
        "enabled": false,
        "createIndex": false,
        "external": [],
        "enterprise": false
      }
    },
    "embeddingsModels": [
      {
        "provider": "sagemaker",
        "name": "intfloat/multilingual-e5-large",
        "dimensions": 1024
      },
      {
        "provider": "sagemaker",
        "name": "sentence-transformers/all-MiniLM-L6-v2",
        "dimensions": 384
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-text-v1",
        "dimensions": 1536
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-image-v1",
        "dimensions": 1024
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-english-v3",
        "dimensions": 1024,
        "default": true
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-multilingual-v3",
        "dimensions": 1024
      },
      {
        "provider": "openai",
        "name": "text-embedding-ada-002",
        "dimensions": 1536
      }
    ],
    "crossEncoderModels": [
      {
        "provider": "sagemaker",
        "name": "cross-encoder/ms-marco-MiniLM-L-12-v2",
        "default": true
      },
      {
        "provider": "cohere",
        "name": "rerank-english-v3.0"
      }
    ]
  }
}

ystoneman, Apr 24 '24 02:04

Hi @ystoneman, this issue occurs because a resource referenced in the nag-suppression rules no longer exists, specifically /cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource. The current logic applies this suppression rule when opensearch or auroradb is selected as a RAG engine. I suppose that in your case you have disabled the SageMaker cross-encoder and use only the external Cohere reranker, which might explain the issue.
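As a rough illustration of that condition, here is a language-neutral sketch written in Python. The actual stack code is TypeScript CDK and is not reproduced here; every function name below is hypothetical, and the extra guard on SageMaker-hosted models is an assumption about one possible fix, not the project's implementation.

```python
def rag_engine_selected(config: dict) -> bool:
    """Condition described above: aurora or opensearch is enabled as a RAG engine."""
    engines = config.get("rag", {}).get("engines", {})
    return any(
        engines.get(name, {}).get("enabled", False)
        for name in ("aurora", "opensearch")
    )


def has_sagemaker_model(config: dict) -> bool:
    """Assumed extra guard: at least one embeddings or cross-encoder model is SageMaker-hosted."""
    rag = config.get("rag", {})
    models = rag.get("embeddingsModels", []) + rag.get("crossEncoderModels", [])
    return any(m.get("provider") == "sagemaker" for m in models)


def should_apply_suppression(config: dict) -> bool:
    """Apply the path suppression only when the SageMaker model resource is expected to exist."""
    return rag_engine_selected(config) and has_sagemaker_model(config)


# Example: opensearch enabled, but only an external reranker configured.
external_only = {
    "rag": {
        "engines": {"opensearch": {"enabled": True}},
        "crossEncoderModels": [{"provider": "cohere", "name": "rerank-english-v3.0"}],
    }
}
print(should_apply_suppression(external_only))
```

Under this sketch, a config with no SageMaker-hosted models would skip the suppression instead of failing on a path that matches no resource.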

massi-ang, Apr 24 '24 08:04

Hi @massi-ang, thanks for your response. I don't think I'm disabling the SageMaker cross-encoder, because both cross-encoders are specified in the config.json.

My desired behavior is that the SageMaker cross-encoder still gets deployed, but I want to provide the ability to toggle between that and the Cohere rerank-english-v3.0 external API endpoint in the cross-encoder dropdown in the UI.
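To make the intended toggle concrete, here is a minimal sketch of how a selection from the dropdown could be resolved against the configured list. select_cross_encoder is a hypothetical helper, and the fallback to the entry flagged "default" is an assumption about the desired behavior rather than something the project is known to do.

```python
def select_cross_encoder(models, provider=None, name=None):
    """Hypothetical helper: return the requested model entry, else the one flagged as default."""
    for model in models:
        if model.get("provider") == provider and model.get("name") == name:
            return model
    for model in models:
        if model.get("default"):
            return model
    raise ValueError("no matching or default cross-encoder model configured")


# The two entries mirror the crossEncoderModels list in the config.json shared earlier.
configured = [
    {"provider": "sagemaker", "name": "cross-encoder/ms-marco-MiniLM-L-12-v2", "default": True},
    {"provider": "cohere", "name": "rerank-english-v3.0"},
]

chosen = select_cross_encoder(configured, "cohere", "rerank-english-v3.0")
fallback = select_cross_encoder(configured)
print(chosen["provider"], fallback["provider"])
```

With both entries present, an explicit Cohere selection resolves to the external reranker, while no selection falls back to the SageMaker model.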

Could you please clarify if there's a better way to handle this scenario in the suppression rules?

ystoneman, May 07 '24 19:05