[#131] DynamoDbChatStorage option to replace sensitive content by placeholder

Open pierrehanne opened this issue 8 months ago • 2 comments

Title:

Add support for replacing sensitive content in DynamoDbChatStorage [Issue #131] @brnaba-aws

Description:

Overview:

This PR introduces the ability to replace sensitive content in chat messages when using the DynamoDbChatStorage class. This ensures that sensitive information (e.g., secret data, personal information) is masked before it is stored in DynamoDB and restored when it is retrieved.

Changes:

  • Sensitive content masking: Added logic to mask sensitive content in messages before saving them to DynamoDB.
  • Sensitive content unmasking: Added functionality to reverse the masking when fetching messages.
  • New parameter: Introduced a sensitive_mappings parameter to the DynamoDbChatStorage class, which contains a dictionary of words/phrases to be masked and their replacements.
  • Helper method: Created the _anonymized_content method to handle the masking and unmasking of sensitive content in both directions (save and fetch).

How It Works:

  1. Masking sensitive content before saving: When saving a message using save_chat_messages(), the content is processed through the _anonymized_content method, where sensitive words (defined in sensitive_mappings) are replaced with their configured placeholders (e.g., "secret" becomes "******").
  2. Unmasking sensitive content after fetching: When fetching messages using fetch_chat(), the same _anonymized_content method is used with the reverse=True flag to unmask previously masked content for retrieval.
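
The two steps above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name anonymize_content and the placeholder strings are hypothetical stand-ins for the _anonymized_content method, and the sketch assumes every sensitive word maps to a distinct placeholder so that the mapping can be inverted.

```python
from typing import Dict


def anonymize_content(text: str, sensitive_mappings: Dict[str, str], reverse: bool = False) -> str:
    """Replace each sensitive phrase with its placeholder, or undo that when reverse=True.

    Hypothetical stand-in for the PR's _anonymized_content method.
    """
    # When reversing, swap keys and values so placeholders map back to the originals.
    mappings = {v: k for k, v in sensitive_mappings.items()} if reverse else sensitive_mappings
    # Replace longer keys first so overlapping phrases are handled predictably.
    for source in sorted(mappings, key=len, reverse=True):
        text = text.replace(source, mappings[source])
    return text


# Assumed example mapping; placeholders are unique so the round trip is lossless.
mappings = {"secret": "[MASK_1]", "classified": "[MASK_2]"}
masked = anonymize_content("The secret plan is classified.", mappings)
restored = anonymize_content(masked, mappings, reverse=True)
print(masked)
print(restored)
```

One design point worth checking in review: if several words share one placeholder such as "******", the reverse lookup can no longer tell which original word to restore, so unique placeholders are the safer choice.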

Test Changes:

  • Added unit tests to verify that sensitive content is correctly masked before saving and unmasked when fetched.
  • Updated test assertions to ensure the masking/unmasking logic works as expected.

Why This Is Useful:

  • Security: Keeps sensitive data from being stored in plain text in DynamoDB.
  • Compliance: Helps in ensuring that sensitive information is handled properly, in line with security and privacy standards.

Testing:

  • The unit tests now include scenarios where messages contain sensitive words like "secret" and "classified". These words are masked when saved (e.g., "secret" becomes "******") and unmasked when retrieved.

How to Test:

  1. Check the DynamoDbChatStorage class for the sensitive_mappings parameter.
  2. Test saving messages containing sensitive data and ensure they are masked.
  3. Test fetching messages and ensure the sensitive data is unmasked correctly.
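
For a quick local check of steps 2 and 3 without AWS access, a round-trip test along these lines may help. It is only a sketch: InMemoryMaskingStorage is a hypothetical in-memory stand-in, not DynamoDbChatStorage, and its save and fetch methods merely mirror the save/fetch behaviour described in this PR.

```python
from typing import Dict, List


class InMemoryMaskingStorage:
    """Hypothetical in-memory stand-in for DynamoDbChatStorage; no AWS calls are made."""

    def __init__(self, sensitive_mappings: Dict[str, str]) -> None:
        self.sensitive_mappings = sensitive_mappings
        self.items: List[str] = []  # what would be written to the table

    def _swap(self, text: str, reverse: bool = False) -> str:
        # Forward: word -> placeholder. Reverse: placeholder -> word.
        for word, placeholder in self.sensitive_mappings.items():
            old, new = (placeholder, word) if reverse else (word, placeholder)
            text = text.replace(old, new)
        return text

    def save(self, content: str) -> None:
        # Mask before "storing".
        self.items.append(self._swap(content))

    def fetch(self) -> List[str]:
        # Unmask when reading back.
        return [self._swap(item, reverse=True) for item in self.items]


storage = InMemoryMaskingStorage({"secret": "<S1>", "classified": "<S2>"})
storage.save("This secret file is classified.")

# Step 2: the stored copy must not contain the sensitive words.
assert storage.items == ["This <S1> file is <S2>."]
# Step 3: fetching must give back the original text.
assert storage.fetch() == ["This secret file is classified."]
```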

pierrehanne • May 07 '25 21:05

Hi @pierrehanne, thank you for the contribution. Can you also update the documentation (the DynamoDB Storage section) to explain this feature and add some code examples?

cornelcroi • May 13 '25 15:05

Hi @cornelcroi, I updated the documentation for DynamoDB.

pierrehanne • May 13 '25 19:05