aws-genai-llm-chatbot
Add Cohere Rerank 3 Support
Issue #, if available: #280
Description of changes: Currently, this solution only supports cross-encoder/ms-marco-MiniLM-L-12-v2. I want to add Cohere Rerank 3 as an option.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
@bigadsoleiman @azaylamba -- Do you have any idea why these changes to add Cohere Rerank 3 support aren't working?
When I click the Cross-Encoder Model dropdown on the /rag/cross-encoders page, I still only see the cross-encoder/ms-marco-MiniLM-L-12-v2 model.
I made sure to define the COHERE_API_KEY in Secrets Manager.
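For context, here is roughly how a key like this would be read from Secrets Manager (a minimal sketch; the secret name and the way the deployed stack actually looks it up are assumptions, so check how the stack wires it in):

```python
def get_cohere_api_key(secrets_client, secret_id: str = "COHERE_API_KEY") -> str:
    """Fetch the Cohere API key from Secrets Manager.

    `secrets_client` is a boto3 Secrets Manager client (or any object
    exposing a compatible get_secret_value method). The secret name
    "COHERE_API_KEY" is an assumption for illustration.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return response["SecretString"]
```

In a deployed environment this would be called as `get_cohere_api_key(boto3.client("secretsmanager"))`. Note that a missing or misnamed secret fails at request time, not at deploy time, so the dropdown showing only one model is more likely a config issue than a key issue.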
In the browser dev tools, I can see the array only has the one cross-encoder:
[{…}]
  0: {provider: 'sagemaker', name: 'cross-encoder/ms-marco-MiniLM-L-12-v2', default: true, __typename: 'CrossEncoderData'}
  length: 1
  [[Prototype]]: Array(0)
Does your config.json include the cohere model after you run npm run config? The dropdown menu loads the list from the config object. See get_cross_encoder_models() in lib/shared/layers/python-sdk/python/genai_core/cross_encoder.py
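In essence, the lookup is a read of the crossEncoderModels array from the config. A simplified sketch (the real function in genai_core/cross_encoder.py loads the config from the deployed stack rather than taking it as an argument, so treat the shape here as illustrative):

```python
import json

def get_cross_encoder_models(config: dict) -> list[dict]:
    """Return the cross-encoder models declared in the config.

    Simplified sketch: the actual implementation reads the config
    object deployed with the stack instead of a passed-in dict.
    """
    return config.get("rag", {}).get("crossEncoderModels", [])

# Example config mirroring the shape produced by `npm run config`.
config = json.loads("""
{
  "rag": {
    "crossEncoderModels": [
      {"provider": "sagemaker",
       "name": "cross-encoder/ms-marco-MiniLM-L-12-v2",
       "default": true},
      {"provider": "cohere", "name": "rerank-english-v3.0"}
    ]
  }
}
""")

models = get_cross_encoder_models(config)
print([m["name"] for m in models])
# → ['cross-encoder/ms-marco-MiniLM-L-12-v2', 'rerank-english-v3.0']
```

If the deployed config lacks the cohere entry, the function returns only the SageMaker model, which would match the single-element array you see in the browser dev tools.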
Is the following cdk-nag error a common issue that anyone else has encountered while modifying this project, @bigadsoleiman, @massi-ang, or @azaylamba?
- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit.
✓ built in 23.95s
/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:98
pathArray.forEach((p) => {
^
Error: Suppression path "/cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource" did not match any resource. This can occur when a resource does not exist or if a suppression is applied before a resource is created.
    at /home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:115:15
    at Array.forEach (<anonymous>)
    at Function.addResourceSuppressionsByPath (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/cdk-nag/src/nag-suppressions.ts:98:15)
    at new AwsGenAILLMChatbotStack (/home/ubuntu/environment/aws-genai-llm-chatbot/lib/aws-genai-llm-chatbot-stack.ts:273:25)
    at Object.<anonymous> (/home/ubuntu/environment/aws-genai-llm-chatbot/bin/aws-genai-llm-chatbot.ts:13:1)
    at Module._compile (node:internal/modules/cjs/loader:1369:14)
    at Module.m._compile (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/ts-node/src/index.ts:1618:23)
    at Module._extensions..js (node:internal/modules/cjs/loader:1427:10)
    at Object.require.extensions.<computed> [as .ts] (/home/ubuntu/environment/aws-genai-llm-chatbot/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1206:32)
Subprocess exited with error 1
This happened after I re-ran npm run config and then npx cdk deploy. I used the same prefix as before because the stack already existed, provided the existing VPC ID, and selected "no" for VPC endpoints since those were already created by the previous deployment.
Re-running everything did make the Cohere rerank model show up in the config file, but now I'm hitting this new issue.
@ystoneman Can you share the config.json file?
Thanks for following up @azaylamba. Here's my config.json:
{
  "prefix": "cloud9",
  "vpc": {
    "vpcId": "vpc-0906dfbea13ffd463",
    "createVpcEndpoints": false
  },
  "privateWebsite": false,
  "certificate": "",
  "domain": "",
  "cfGeoRestrictEnable": false,
  "cfGeoRestrictList": [],
  "bedrock": {
    "enabled": true,
    "region": "us-east-1"
  },
  "llms": {
    "sagemaker": [],
    "huggingfaceApiSecretArn": ""
  },
  "rag": {
    "enabled": true,
    "engines": {
      "aurora": {
        "enabled": false
      },
      "opensearch": {
        "enabled": true
      },
      "kendra": {
        "enabled": false,
        "createIndex": false,
        "external": [],
        "enterprise": false
      }
    },
    "embeddingsModels": [
      {
        "provider": "sagemaker",
        "name": "intfloat/multilingual-e5-large",
        "dimensions": 1024
      },
      {
        "provider": "sagemaker",
        "name": "sentence-transformers/all-MiniLM-L6-v2",
        "dimensions": 384
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-text-v1",
        "dimensions": 1536
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-image-v1",
        "dimensions": 1024
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-english-v3",
        "dimensions": 1024,
        "default": true
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-multilingual-v3",
        "dimensions": 1024
      },
      {
        "provider": "openai",
        "name": "text-embedding-ada-002",
        "dimensions": 1536
      }
    ],
    "crossEncoderModels": [
      {
        "provider": "sagemaker",
        "name": "cross-encoder/ms-marco-MiniLM-L-12-v2",
        "default": true
      },
      {
        "provider": "cohere",
        "name": "rerank-english-v3.0"
      }
    ]
  }
}
Hi @ystoneman, this issue is caused by a resource referenced in the nag-suppression rules no longer being present, specifically /cloud9GenAIChatBotStack/RagEngines/SageMaker/Model/MultiAB24A/CodeBuildRole/DefaultPolicy/Resource. The current logic applies this suppression rule whenever OpenSearch or Aurora is selected as a RAG engine. My guess is that in your case you have disabled the SageMaker cross-encoder and only use the external Cohere reranker, which would explain the mismatch.
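One way to avoid this class of error is to gate the suppression on the same condition that creates the resource. A hypothetical sketch of that guard (the real stack is TypeScript and the exact config keys it checks may differ, so this is just the shape of the idea):

```python
def should_suppress_sagemaker_codebuild(config: dict) -> bool:
    """Hypothetical helper: apply the MultiAB24A CodeBuildRole nag
    suppression only when the SageMaker multi-model construct will
    actually be created, i.e. when at least one model in the config
    uses the 'sagemaker' provider."""
    rag = config.get("rag", {})
    models = rag.get("embeddingsModels", []) + rag.get("crossEncoderModels", [])
    return any(m.get("provider") == "sagemaker" for m in models)

# A config declaring SageMaker models -> the resource exists, suppress.
with_sagemaker = {
    "rag": {
        "embeddingsModels": [
            {"provider": "sagemaker", "name": "intfloat/multilingual-e5-large"}
        ],
        "crossEncoderModels": [
            {"provider": "cohere", "name": "rerank-english-v3.0"}
        ],
    }
}

# A config with only external providers -> the resource is never
# created, so the suppression path would not match anything.
cohere_only = {
    "rag": {
        "embeddingsModels": [
            {"provider": "bedrock", "name": "cohere.embed-english-v3"}
        ],
        "crossEncoderModels": [
            {"provider": "cohere", "name": "rerank-english-v3.0"}
        ],
    }
}

print(should_suppress_sagemaker_codebuild(with_sagemaker))  # → True
print(should_suppress_sagemaker_codebuild(cohere_only))     # → False
```

With this kind of guard, cdk-nag's addResourceSuppressionsByPath is only invoked when the suppression path can actually resolve to a created resource, rather than whenever OpenSearch or Aurora is enabled.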
Hi @massi-ang, thanks for your response. I don't think I'm disabling the SageMaker cross-encoder, since both cross-encoders are specified in my config.json.
My desired behavior is for the SageMaker cross-encoder to still be deployed, while letting users toggle between it and the Cohere rerank-english-v3.0 external API endpoint via the cross-encoder dropdown in the UI.
Could you please clarify if there's a better way to handle this scenario in the suppression rules?