pr-agent
Secret Masking
Feature request
Secrets should be masked before data is sent to AI models, for security.
Motivation
Some repositories may contain sensitive information that should not be shared with AI models, so it should be masked before the data is sent to the model.
:bulb: Auto-generated documentation-based answer:
This feature request is already addressed in PR-Agent/Qodo Merge through the "ignore" configuration options that allow you to exclude specific files or patterns from analysis.
You can ignore files or folders containing sensitive information using either glob patterns or regex patterns:
- Using glob patterns:
[ignore]
glob = ['**/secrets/*', '*.env']
- Using regex patterns:
[ignore]
regex = ['.*secret.*', '.*password.*']
These configurations can be set in your configuration file or passed as command-line arguments. For example:
/review --ignore.glob="['*.env']"
This ensures that sensitive files or files matching specific patterns are excluded from the data sent to AI models, effectively masking secrets from the analysis process.
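To illustrate how the two pattern types above differ, here is a minimal Python sketch. It is not PR-Agent's actual implementation: the `is_ignored` helper is hypothetical, and it assumes globs are matched with `fnmatch` (where `*` also crosses `/`) and regexes are searched anywhere in the path.

```python
import fnmatch
import re

# Patterns copied from the configuration examples above.
GLOB_PATTERNS = ["**/secrets/*", "*.env"]
REGEX_PATTERNS = [".*secret.*", ".*password.*"]


def is_ignored(path, glob_patterns=(), regex_patterns=()):
    """Hypothetical helper: True if `path` matches any ignore pattern."""
    # fnmatchcase treats '*' as "any characters", including '/'.
    if any(fnmatch.fnmatchcase(path, g) for g in glob_patterns):
        return True
    # Regex patterns are searched anywhere in the path string.
    return any(re.search(r, path) for r in regex_patterns)


changed_files = [
    "config/prod.env",
    "app/secrets/key.pem",
    "docs/password_policy.md",
    "src/main.py",
]

# Only files that match no pattern would be passed on for analysis.
visible = [
    p for p in changed_files
    if not is_ignored(p, GLOB_PATTERNS, REGEX_PATTERNS)
]
print(visible)
```

Note that broad regexes such as `.*password.*` also exclude harmless files like `docs/password_policy.md`, so it is worth checking patterns against a sample file list before relying on them.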
Relevant Sources:
- https://qodo-merge-docs.qodo.ai//usage-guide/additional_configurations#ignoring-files-from-analysis
Thanks. While this helps, can we also look into integrating tools like LLM Guard? https://github.com/protectai/llm-guard?tab=readme-ov-file
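For context, the kind of masking being asked for could look roughly like the sketch below. This is only an illustration using the Python standard library: it does not use LLM Guard's API and is not an existing PR-Agent feature. The `mask_secrets` function, the two regexes, and the placeholder string are all assumptions, and real secret scanners use far more extensive rules.

```python
import re

# Heuristic patterns (assumptions for illustration only).
# 1) key/value assignments such as "password = hunter2" or "API_KEY: abc"
ASSIGNMENT = re.compile(r"(?i)(password|secret|token|api_key)(\s*[:=]\s*)(\S+)")
# 2) strings shaped like an AWS access key ID
AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")


def mask_secrets(text, placeholder="***MASKED***"):
    """Replace likely secret values in `text` before it leaves the machine."""
    # Keep the key name and separator, replace only the value.
    text = ASSIGNMENT.sub(lambda m: m.group(1) + m.group(2) + placeholder, text)
    # Replace anything that looks like an AWS access key ID.
    return AWS_KEY_ID.sub(placeholder, text)


diff_text = "password = hunter2\nkey AKIAABCDEFGHIJKLMNOP\ndef add(a, b): return a + b"
print(mask_secrets(diff_text))
```

A step like this would run on the diff text before it is placed into the prompt; ordinary code lines pass through unchanged.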
Hi @akshaypatidar1999
Sensitive data should never be committed to git. Never.
For 'chat-gpt-in-ide' tools, I can understand why a masking feature might be needed - you are working in an intermediate state, and you might have local uncommitted files.
But in PRs, secrets should not appear. If they do, PR-Agent should raise an alert. If we start masking the PR content before it reaches the AI, PR-Agent will fail to alert on secrets, as it should.
In addition, most AI providers today support zero data retention, so the harm of sending "sensitive" data (in the very rare cases where it might occur) is low.
Hi @mrT23,
I’ve encountered an issue with the --ignore-glob CLI argument when trying to ignore multiple file patterns. While it works correctly with a single pattern (e.g., --ignore-glob="['.properties']"), it fails to ignore files when multiple patterns are provided in a list (e.g., --ignore-glob="['.env', '*.properties']").