GPT-RAG Responsible AI improvement with Content Safety/Content Filtering (multiple items)

Open gbecerra1982 opened this issue 1 year ago • 2 comments

List of tasks (see the item description below):

[x] #131
[x] Add protected material check
[x] Add prompt attack check (Prompt Shields)
[x] Add harm detection (text moderation, harm categories)
[ ] Review Content Safety features to check whether they add value over the AOAI out-of-the-box (OOTB) content filtering.
[ ] Migrate blocked words list (AOAI filtering)

Item description

Users should be able to choose which functions of the Responsible AI plugin act as guardrails, both when the user's question is received and before the response is sent back, and to set a threshold for each function.

List of functions:

Unfairness
Harm detection (text moderation, harm categories)
Prompt attacks (Prompt Shields)
Protected material
Groundedness check (migrate to CS)
Blocked words (migrate to CS)
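
To make the requirement concrete, here is a minimal sketch of per-function guardrail settings applied at the input and output stages. Everything in it is an assumption for illustration: the names `GuardrailConfig`, `run_guardrails`, `stub_checks`, and `example_config` are hypothetical, the integer severity scale is assumed, and the stub checks stand in for real detectors rather than calling any Azure service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names here are hypothetical; none are taken from the GPT-RAG codebase.


@dataclass
class GuardrailConfig:
    enabled: bool = False
    threshold: int = 4  # assumed integer severity scale; block at or above this value
    stages: Tuple[str, ...] = ("input", "output")  # where the check runs


# A check maps text to a severity score. Real checks would call a service;
# here they are plain callables so the sketch stays self-contained.
Check = Callable[[str], int]


def run_guardrails(text: str, stage: str,
                   config: Dict[str, GuardrailConfig],
                   checks: Dict[str, Check]) -> List[str]:
    """Return the names of enabled guardrails that block `text` at `stage`."""
    blocked = []
    for name, cfg in config.items():
        if not cfg.enabled or stage not in cfg.stages:
            continue
        check = checks.get(name)
        if check is None:
            continue  # no implementation registered for this function
        if check(text) >= cfg.threshold:
            blocked.append(name)
    return blocked


# Stub checks standing in for real detectors.
stub_checks = {
    "harm_detection": lambda text: 6 if "hurt" in text else 0,
    "prompt_attacks": lambda text: 1 if "ignore previous instructions" in text.lower() else 0,
}

example_config = {
    "harm_detection": GuardrailConfig(enabled=True, threshold=4),
    "prompt_attacks": GuardrailConfig(enabled=True, threshold=1, stages=("input",)),
    "protected_material": GuardrailConfig(enabled=False),
}
```

Calling `run_guardrails(question, "input", example_config, stub_checks)` before the model call and `run_guardrails(answer, "output", example_config, stub_checks)` before replying would cover both stages; restricting `prompt_attacks` to the input stage is just one possible choice.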

Notes:

Users can configure which functions to use, and their thresholds, in the gpt-rag configuration.
Checks that native Azure OpenAI content filtering already covers should rely on it, so we save API calls.
Orchestrator responses should include metadata about the guardrail results so that a future APIM policy or Security function can inspect and enforce them.
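
One possible shape for that response metadata is sketched below. The field names (`guardrails`, `blocked`, `results`) and the helpers `GuardrailResult` and `build_response` are hypothetical, not an existing GPT-RAG contract; the point is only that each check reports its stage, score, threshold, and decision so a downstream component can enforce on it.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List

# Hypothetical response shape; not an existing GPT-RAG schema.


@dataclass
class GuardrailResult:
    name: str        # e.g. "harm_detection"
    stage: str       # "input" or "output"
    severity: int    # score reported by the check (assumed integer scale)
    threshold: int   # configured threshold the score was compared against
    blocked: bool    # True when severity >= threshold


def build_response(answer: str, results: List[GuardrailResult]) -> Dict[str, Any]:
    """Attach guardrail metadata to the orchestrator answer.

    The answer is withheld when any guardrail blocked it, but the metadata
    is always returned so a downstream policy can inspect it.
    """
    blocked = any(r.blocked for r in results)
    return {
        "answer": "" if blocked else answer,
        "guardrails": {
            "blocked": blocked,
            "results": [asdict(r) for r in results],
        },
    }
```

The returned dictionary holds only strings, integers, booleans, and lists, so it can be serialized to JSON as-is for a gateway or security function to read.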

Out-of-scope items, to be handled in a separate item:

IaC (Bicep) update to create and configure the Content Safety service
Architecture redesign: create a new Azure Function, "Custom Security Policy", that receives text from the Orchestrator and validates that it contains no violent, sexual, or similar content. This function is the beginning of a Security Function that adds security controls to the platform; additional controls will be introduced later.

We need to prepare this function so that the Security Team can add further controls (e.g., Microsoft Purview).

References: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/azure-ai-announces-prompt-shields-for-jailbreak-and-indirect/ba-p/4099140

https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/detect-and-mitigate-ungrounded-model-outputs/ba-p/4099261

Oct 24 '23 11:10 gbecerra1982
