RAG-style attack test and related enhancements

Open abutbul opened this issue 6 months ago • 0 comments

Overview

This pull request introduces several enhancements to the ps-fuzz testing framework, including a new test, embedding configurations, unit tests, some minor refactoring, and additional dependencies. These changes aim to improve the flexibility of the testing framework.

Changes

A new attack named "Hidden Parrot Attack" demonstrates how malicious instructions can be embedded in vector databases to compromise RAG system behavior. The implementation is located in [rag_poisoning.py]

Embedding Configuration:

Added support for embedding providers (ollama, open_ai) and models, including configuration for base URLs.
Embedding-specific base URLs can now be configured independently of the main provider URLs.

Base URL Support:

Introduced support for configuring base URLs for ollama and open_ai providers. (You can mix and match!)
Base URLs can be set via the configuration file, command-line arguments, or interactive menus.

Refactoring:

Refactored provider and model prompts to reduce duplication and improve maintainability.
Introduced helper functions for building client and embedding configurations.

Added Dependencies

chromadb: Added for vector database operations in the RAG poisoning attack.
tiktoken: Added for tokenization support in embedding-related operations.
- Updated setup.py and pyproject.toml (nodding at legacy package setup) to include the new dependencies.

Impact

The embedding configuration enhancements enable more advanced attack simulations, further strengthening our testing framework. rag_poisoning attack demonstrate easily exploitable vulnerability in many vector-DB backed RAG pipelines.

Testing

The new test have been integrated into the existing test suite and validated for correctness and performance impact.
Skipped tests are now properly reported with detailed logs.

P.S. I realize adding skipping status to tests is out of scope, however, I have ran some edge tests with missing libraries/configuration. Test pipeline errors reported as failed(vulnerable) in the default summary view rather than reporting as skipped. There is existing boilerplate for errors(⚠) to avoid breaking legacy, I added skipped. All that said, I may be missing a better way to report.

Aug 29 '25 19:08 abutbul