helicone icon indicating copy to clipboard operation
helicone copied to clipboard

Add new bucket

Open H2Shami opened this issue 1 month ago • 3 comments

Ticket

Link to the ticket(s) this pull request addresses.

Component/Service

What part of Helicone does this affect?

  • [ ] Web (Frontend)
  • [x] Jawn (Backend)
  • [ ] Worker (Proxy)
  • [ ] Bifrost (Marketing)
  • [ ] AI Gateway
  • [ ] Packages
  • [x] Infrastructure/Docker
  • [ ] Documentation

Type of Change

  • [ ] Bug fix
  • [ ] New feature
  • [x] Breaking change
  • [ ] Documentation update
  • [ ] Performance improvement
  • [ ] Refactoring

Testing

  • [ ] Added/updated unit tests
  • [ ] Added/updated integration tests
  • [x] Tested locally
  • [ ] Verified in staging environment
  • [ ] E2E tests pass (if applicable)

Technical Considerations

  • [ ] Database migrations included (if needed)
  • [ ] API changes documented
  • [ ] Breaking changes noted
  • [ ] Performance impact assessed
  • [ ] Security implications reviewed

Dependencies

  • [ ] No external dependencies added
  • [ ] Dependencies added and documented
  • [x] Environment variables added/modified

Deployment Notes

  • [ ] No special deployment steps required
  • [ ] Database migrations need to run
  • [x] Environment variable changes required
  • [ ] Coordination with other teams needed

Context

prevents prompts from getting deleted after 90 days along with requests

Misc. Review Notes

Add S3_PROMPT_BUCKET_NAME="prompt-body-storage" to your jawn env file

H2Shami avatar Nov 27 '25 01:11 H2Shami

The latest updates on your projects. Learn more about Vercel for GitHub.

3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
helicone Skipped Skipped Dec 1, 2025 7:08pm
helicone-bifrost Skipped Skipped Dec 1, 2025 7:08pm
helicone-eu Skipped Skipped Dec 1, 2025 7:08pm

vercel[bot] avatar Nov 27 '25 01:11 vercel[bot]

Claude finished @H2Shami's task —— View job


PR Code Review

Todo List:

  • [x] Read all changed files to understand implementation
  • [x] Analyze reviewer feedback and concerns
  • [x] Check for bugs, security issues, and performance problems
  • [x] Provide comprehensive review feedback with recommendations

Score: 7/10 - Good implementation with minor concerns, safe to merge with reviewer feedback addressed

Review Summary

This PR successfully implements a dedicated S3 bucket for prompt storage to prevent 90-day lifecycle deletion. The implementation is solid and addresses the core requirement. However, there are some issues raised by reviewers that should be addressed.

Key Issues Found:

1. Worker Configuration Inconsistency (Reviewer Concern)

The latest commit aa31773 removed worker-related S3 prompt bucket variables, which aligns with reviewer feedback that the worker doesn't actually use prompt buckets. This was correctly addressed.

2. Missing EU Bucket Configuration in supervisord.conf

File: supervisord.conf:83

# Current - missing prompt bucket creation
/usr/local/bin/mc mb --ignore-existing localminio/request-response-storage && exit 0"

# Should include both buckets
/usr/local/bin/mc mb --ignore-existing localminio/request-response-storage && /usr/local/bin/mc mb --ignore-existing localminio/prompt-body-storage && exit 0"

3. Bucket Naming Consistency Question

Reviewers noted that EU buckets typically have -eu suffix. The current naming pattern:

  • S3_PROMPT_BUCKET_NAME="prompt-body-storage"
  • EU_S3_PROMPT_BUCKET_NAME="prompt-body-storage" (should be prompt-body-storage-eu)

Code Quality Assessment:

PromptManager.ts - ✅ Excellent

  • Line 48: Properly uses dedicated prompt bucket environment variable
  • Security: S3 client properly configured with authentication
  • Error handling: Comprehensive error handling throughout
  • Code style: Clean, well-formatted TypeScript

Docker Configuration - ✅ Good

  • Lines 79, 154, 228: Consistent bucket environment variable setup across all profiles
  • MinIO setup: Properly creates both buckets on startup
  • Health checks: Appropriate dependencies and health checks

Environment Configuration - ✅ Good

  • Line 6: Added S3_PROMPT_BUCKET_NAME to example file
  • Clear documentation in PR description about required env var

Suggestions Summary:

Critical Fixes:

  1. Fix supervisord.conf MinIO setup - Add prompt bucket creation to line 83
  2. Consider EU naming consistency - Decide if EU buckets should have -eu suffix

Minor Improvements:

  1. Environment documentation - The PR description includes the required env var, which is helpful
  2. Test coverage - Consider adding integration tests for the new bucket separation

Security & Performance:

  • ✅ No security issues identified
  • ✅ Performance impact minimal - just bucket routing change
  • ✅ Proper authentication and access control maintained

Deployment Readiness:

  • ✅ Breaking change properly documented
  • ✅ Environment variable requirements clear
  • ✅ Database migrations not needed
  • ⚠️ Need to migrate existing prompt files from old bucket before deployment

The implementation correctly separates prompt storage from request/response storage, achieving the goal of preventing prompt data loss due to lifecycle policies. The code quality is high and follows established patterns.

claude[bot] avatar Nov 27 '25 01:11 claude[bot]

Greptile Overview

Greptile Summary

This PR adds a new dedicated S3 bucket (prompt-body-storage) for storing prompt data, separating it from the existing request/response storage bucket.

  • Updated PromptManager.ts to use S3_PROMPT_BUCKET_NAME instead of S3_BUCKET_NAME
  • Added environment variables and bucket configuration across all environments (local, staging, production)
  • Configured EU region routing to properly use the EU prompt bucket
  • Includes formatting improvements in PromptManager.ts

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are straightforward infrastructure updates that properly separate prompt storage from request/response storage. All environment configurations are consistently updated across development, staging, and production environments, including proper EU region support.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
docker/docker-compose.yml 5/5 added prompt-body-storage bucket creation and S3_PROMPT_BUCKET_NAME environment variable for local development
valhalla/jawn/src/managers/prompt/PromptManager.ts 5/5 updated S3Client initialization to use dedicated prompt bucket and applied formatting fixes
worker/src/index.ts 5/5 added S3_PROMPT_BUCKET_NAME routing for EU region requests
worker/wrangler.toml 5/5 added S3_PROMPT_BUCKET_NAME and EU_S3_PROMPT_BUCKET_NAME configuration variables for all environments

Sequence Diagram

sequenceDiagram
    participant Client
    participant Worker
    participant Env Config
    participant PromptManager
    participant S3 Prompt Bucket
    participant S3 Request Bucket

    Note over Client,S3 Request Bucket: New Bucket Routing Flow

    Client->>Worker: Request (EU or US region)
    Worker->>Env Config: Check if EU request
    
    alt EU Request
        Env Config->>Env Config: Set S3_PROMPT_BUCKET_NAME = EU_S3_PROMPT_BUCKET_NAME
        Env Config->>Env Config: Set S3_BUCKET_NAME = EU_S3_BUCKET_NAME
    else US Request
        Env Config->>Env Config: Use default S3_PROMPT_BUCKET_NAME
        Env Config->>Env Config: Use default S3_BUCKET_NAME
    end
    
    Worker->>PromptManager: Initialize with environment
    PromptManager->>PromptManager: Create S3Client(S3_PROMPT_BUCKET_NAME)
    
    Note over PromptManager,S3 Request Bucket: Separate Storage
    
    PromptManager->>S3 Prompt Bucket: Store/Retrieve Prompt Data
    Note right of S3 Prompt Bucket: prompt-body-storage
    
    Worker->>S3 Request Bucket: Store/Retrieve Request/Response Data
    Note right of S3 Request Bucket: request-response-storage

greptile-apps[bot] avatar Nov 27 '25 01:11 greptile-apps[bot]