Proposal: non-blocking CI report for recorded RAG chat output (PromptProof)

Open geminimir opened this issue 6 months ago • 0 comments

I’d like to add a tiny, report-only CI check that produces a one-glance HTML report showing whether a recorded RAG chat response still matches the expected shape, without any live model calls and without blocking merges.

Why this helps

  • Prevents silent drift in the component’s output contract (e.g., { message: string, sources?: string[] }) when making changes or refactoring.
  • Gives maintainers & contributors a clear, visual artifact on each PR: schema validation, basic safety checks, and cost/latency summary in one place.
  • Acts as lightweight regression testing without requiring a dedicated backend or API keys.
  • Non-blocking and deterministic: the check replays a recorded fixture with a fixed seed across 3 runs, so results do not flake.
  • Keeps CI scoped: the workflow's paths filter limits it to a few new files, so it is safe for a library repo and does not trigger unrelated builds.
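
To make the first point concrete, here is a minimal sketch of the kind of drift such a check is meant to catch. It is plain Python and does not involve PromptProof at all; matches_contract is a hypothetical helper written for this illustration, and the renamed field in the second record is an invented example of a refactor gone wrong.

```python
from typing import Any


def matches_contract(output: Any) -> bool:
    """Return True if output looks like { message: string, sources?: string[] }.

    Hypothetical helper for illustration only; not part of PromptProof
    or of the component.
    """
    if not isinstance(output, dict):
        return False
    if not isinstance(output.get("message"), str):
        return False
    sources = output.get("sources")
    if sources is None:  # sources is optional
        return True
    return isinstance(sources, list) and all(isinstance(s, str) for s in sources)


# Recorded output before a refactor.
before = {"message": "Sample deterministic blurb.", "sources": ["https://example.com/source"]}
# The same output after a hypothetical refactor that renamed `message` to `text`.
after = {"text": "Sample deterministic blurb.", "sources": ["https://example.com/source"]}

print(matches_contract(before))  # True
print(matches_contract(after))   # False
```

Because both records are fixed data, the result is identical on every run, which is what makes a report-only check like this cheap to keep around.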

Files to add

  1. .github/workflows/promptproof.yml
name: PromptProof
on:
  pull_request:
    paths:
      - ".github/workflows/promptproof.yml"
      - "promptproof.yaml"
      - "fixtures/promptproof/**"
jobs:
  proof:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: geminimir/promptproof-action@v0
        with:
          config: promptproof.yaml
          runs: 3
          seed: 1337
          max-run-cost: 0.50
          report-artifact: promptproof-report
          mode: report-only
  2. promptproof.yaml
mode: fail
format: html
fixtures:
  - path: fixtures/promptproof/rag_chat.json
checks:
  - id: rag_message_schema
    type: schema
    json_schema:
      type: object
      properties:
        output:
          type: object
          properties:
            message: { type: string, minLength: 1 }
            sources: { type: array, items: { type: string }, nullable: true }
          required: [message]
      required: [output]
budgets:
  max_run_cost: 0.50
stability:
  runs: 3
  seed: 1337
  3. fixtures/promptproof/rag_chat.json
{
  "record_id": "upstash-rag-001",
  "input": { "query": "What is vector search?" },
  "output": { "message": "Sample deterministic blurb.", "sources": ["https://example.com/source"] }
}
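
For anyone who wants to sanity-check the fixture against the schema before wiring up the action, the rules in rag_message_schema can be approximated with a few lines of standard-library Python. This is only a sketch: validate is a hypothetical helper that handles just the keywords used above (type, properties, required, minLength, items, nullable). It is not a full JSON Schema implementation, and it is not how PromptProof itself evaluates the check.

```python
import json
from typing import Any, List

# Schema copied from the rag_message_schema check in promptproof.yaml.
SCHEMA = {
    "type": "object",
    "properties": {
        "output": {
            "type": "object",
            "properties": {
                "message": {"type": "string", "minLength": 1},
                "sources": {"type": "array", "items": {"type": "string"}, "nullable": True},
            },
            "required": ["message"],
        }
    },
    "required": ["output"],
}

# Contents of fixtures/promptproof/rag_chat.json, inlined to keep the sketch self-contained.
FIXTURE = """
{
  "record_id": "upstash-rag-001",
  "input": { "query": "What is vector search?" },
  "output": { "message": "Sample deterministic blurb.", "sources": ["https://example.com/source"] }
}
"""

PY_TYPES = {"object": dict, "array": list, "string": str}


def validate(value: Any, schema: dict, path: str = "$") -> List[str]:
    """Return a list of violations; supports only the keywords used in SCHEMA."""
    if value is None and schema.get("nullable"):
        return []
    if not isinstance(value, PY_TYPES[schema["type"]]):
        return [f"{path}: expected {schema['type']}"]
    errors: List[str] = []
    if isinstance(value, str) and len(value) < schema.get("minLength", 0):
        errors.append(f"{path}: shorter than minLength")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required property missing")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


record = json.loads(FIXTURE)
print(validate(record, SCHEMA))                       # []
print(validate({"output": {"message": ""}}, SCHEMA))  # ['$.output.message: shorter than minLength']
```

Note that neither this sketch nor the schema as written rejects unknown keys, so a renamed optional field such as sources would still pass unless additionalProperties: false is added to the schema.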

What maintainers get

  • A single HTML report artifact per PR (schema/regex/cost summary).
  • Zero live calls; easy to delete if unwanted.

References

Sample report: https://geminimir.github.io/promptproof-action/reports/before.html

If this sounds okay, I’ll open a 3-file PR and can tweak the checks/paths to your preference.

Marketplace: https://github.com/marketplace/actions/promptproof-eval
Demo project: https://github.com/geminimir/promptproof-demo-project

geminimir, Aug 15 '25 03:08