Self-Hosted Docker: Worker Proxy Fails to Log to Jawn ('Auth failed! Network connection lost')

Open • Cemk74 opened this issue 10 months ago • 2 comments

Problem Description:

I am running a self-hosted Helicone instance using the official helicone/docker setup with Docker Compose. I have an external application (a RAG service) configured to send OpenAI requests through the helicone-worker-openai-proxy service.

The worker proxy successfully handles the incoming requests from my application and proxies them to OpenAI (indicated by POST /v1/chat/completions 200 OK in the worker logs). However, immediately after handling the request, the worker fails when it tries to log the request data internally. The worker logs show this error: ✘ [ERROR] Error logging Auth failed! Error: Network connection lost.

At the same time, the helicone-jawn container logs show clean startup messages (Server is running on http://localhost:8586), but there is no indication that it ever receives the incoming log request on its /v1/log/request endpoint (verified by adding logging to Jawn's controller and middleware). As a result, although requests are processed, no data appears in the Helicone frontend UI (https://.com in my case).

Because of a problem with the Stripe billing checkout, I changed the onboarding column of the organization table to true to get the api_key, and used that key in my RAG service's .env file.

Environment:

  • Helicone deployment: Docker Compose, using the setup in the helicone/docker directory of the main repository.
  • Key containers: helicone-worker-openai-proxy, helicone-jawn, helicone-supabase-db, helicone-clickhouse-db, helicone-minio, helicone-web, helicone-supabase-kong, chatwoot-nginx-1 (as external reverse proxy).
  • External app: Python service using the OpenAI client pointed to http://worker-openai-proxy:8780/v1.

Cemk74 avatar Apr 06 '25 07:04 Cemk74

TL;DR

  • SUPABASE_SERVICE_ROLE_KEY must be a real JWT signed with the same secret PostgREST knows (PGRST_JWT_SECRET).
  • Replace the placeholder test-key with that JWT and restart the workers; logging then flows to Jawn again.

1. Reproduced the issue

   # fresh clone of main repo
   cd helicone/docker
   docker compose \
     --profile include-helicone \
     --profile workers up -d

The proxy replied POST /v1/chat/completions 200 but immediately afterwards the worker logged

   ✘ [ERROR] Error logging Auth failed!
       JWSError (CompactDecodeError Invalid number of parts: Expected 3 parts; got 1)

and no line ever reached helicone-jawn.

2. What those messages really mean

  • All workers use the Supabase JS client internally for auth.
  • In the self-hosted stack we don’t run Supabase; instead, we expose the same PostgREST interface under postgrest:3000 and point the workers to it via SUPABASE_URL=http://postgrest:3000.
  • PostgREST requires a real JWT in the Authorization: Bearer … header. The docker-compose template ships with the placeholder SUPABASE_SERVICE_ROLE_KEY=test-key, which is not a JWT, so PostgREST can’t split it into header.payload.signature and throws the CompactDecodeError you saw.
  • The workers bubble that failure up as “Auth failed! Network connection lost”, skip the log batch, and Jawn never sees anything.
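
The three-part requirement is easy to check before starting the stack. The following is a minimal sketch in Python, assuming only that the key should have the compact header.payload.signature shape. The helper name looks_like_jwt is hypothetical and not part of Helicone or PostgREST, and the check is purely structural: it does not verify the signature.

```python
import base64
import json


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def looks_like_jwt(key: str) -> bool:
    """Return True if key has the header.payload.signature shape of a JWT.

    Structural check only (hypothetical helper): the signature is NOT verified.
    """
    parts = key.split(".")
    if len(parts) != 3 or not all(parts):
        return False
    try:
        header = json.loads(_b64url_decode(parts[0]))
        payload = json.loads(_b64url_decode(parts[1]))
    except ValueError:
        # Covers bad base64, invalid UTF-8 and invalid JSON alike.
        return False
    return isinstance(header, dict) and isinstance(payload, dict)


# The placeholder from the compose template has one part, not three.
print(looks_like_jwt("test-key"))  # False
```

A key that fails this check would be expected to trigger the same "Expected 3 parts" decode error shown above.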

3. Fix implemented

  • Generate an HS256 JWT signed with the same secret that you pass to PostgREST (PGRST_JWT_SECRET in compose). Minimal non-expiring token:
      # variables (exported so the Python child process can read them)
      export JWT_SECRET="your-jwt-secret-here"
      export PAYLOAD='{"role":"service_role","iat":0}'

      # build header.payload.signature; each part is base64url without padding
      TOKEN=$(python3 - <<'PY'
      import base64, hmac, hashlib, os
      secret = os.environ["JWT_SECRET"].encode()
      header = base64.urlsafe_b64encode(b'{"alg":"HS256","typ":"JWT"}').rstrip(b'=')
      payload = base64.urlsafe_b64encode(os.environ["PAYLOAD"].encode()).rstrip(b'=')
      # HMAC-SHA256 over "header.payload" using the shared secret
      sig = hmac.new(secret, header+b'.'+payload, hashlib.sha256).digest()
      token = header+b'.'+payload+b'.'+base64.urlsafe_b64encode(sig).rstrip(b'=')
      print(token.decode())
      PY
      )
      echo "$TOKEN"
  • Update every worker service in docker/docker-compose.yml (there are five of them) so they read
      SUPABASE_SERVICE_ROLE_KEY: ${SUPABASE_SERVICE_ROLE_KEY:-<your-token>}

The full patch consists only of env-var changes; no code changes are needed.

  • Re-create the workers (or the whole stack):
      cd helicone/docker
      docker compose --profile include-helicone --profile workers \
        up -d --force-recreate
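
Before recreating the containers, it can help to confirm that the token really verifies against the secret given to PostgREST. The following is a minimal sketch using only the Python standard library, assuming an HS256 token. make_hs256 and verify_hs256 are hypothetical helpers written for illustration, the secret is a placeholder, and the sketch checks only the signature (not the alg header or any expiry claim).

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> bytes:
    """base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def _b64url_decode(segment: bytes) -> bytes:
    """base64url-decode, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + b"=" * (-len(segment) % 4))


def make_hs256(secret: str, claims: dict) -> str:
    """Build header.payload.signature signed with HMAC-SHA256 (hypothetical helper)."""
    header = _b64url(b'{"alg":"HS256","typ":"JWT"}')
    payload = _b64url(json.dumps(claims, separators=(",", ":")).encode())
    sig = hmac.new(secret.encode(), header + b"." + payload, hashlib.sha256).digest()
    return (header + b"." + payload + b"." + _b64url(sig)).decode()


def verify_hs256(token: str, secret: str) -> dict:
    """Return the claims if the signature matches; raise ValueError otherwise."""
    try:
        header, payload, sig = token.encode().split(b".")
    except ValueError:
        raise ValueError("expected 3 dot-separated parts") from None
    expected = hmac.new(secret.encode(), header + b"." + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("signature does not match secret")
    return json.loads(_b64url_decode(payload))


# Placeholder secret; use the same value as PGRST_JWT_SECRET in compose.
token = make_hs256("your-jwt-secret-here", {"role": "service_role", "iat": 0})
claims = verify_hs256(token, "your-jwt-secret-here")
print(claims["role"])  # service_role
```

Calling verify_hs256 with a different secret, or with the placeholder test-key, raises ValueError, which mirrors the mismatch the workers ran into.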

4. Result

The very first request after the restart produced:

   [wrangler:inf] POST /v1/chat/completions 200 OK (742 ms)
   Upserting logs for batch (7 requests) …   <-- jawn now receiving data
   Finished processing batch (7 requests)

and the requests appeared in the Web UI a few seconds later.

Typical “before” vs “after” logs

  • Before:
      ✘ [ERROR] Authentication failed:
      JWSError (CompactDecodeError Invalid number of parts: Expected 3 parts; got 1)
  • After:
      [wrangler:inf] POST /v1/chat/completions 200 OK (712ms)
      Upserting logs for batch (1 requests)
      Finished processing batch (1 requests)

Why I didn’t change the code

  • There is already a proper authentication path in the worker; it expects a valid Supabase/PostgREST bearer token.
  • Once the env var contained a real JWT everything worked, so no patches to DBWrapper.ts or Jawn were required.

cc: @chitalian @connortbot

swarna1101 avatar Jul 13 '25 19:07 swarna1101

Hey @Cemk74, we've updated our Docker self-hosting docs a great deal. ./helicone_compose up helicone is the recommended way. We also launched the helicone-all-in-one container: https://docs.helicone.ai/getting-started/self-host/docker

connortbot avatar Jul 15 '25 18:07 connortbot