
[Bug]: 401 on privileged actions after cold restart despite valid login

Open • InigoGastesi opened this issue 3 months ago • 1 comment

🐞 Bug Summary

After a cold restart of the server/Kubernetes node (e.g., powered off overnight), the Admin Web UI intermittently returns 401 Unauthorized for privileged actions even though I appear logged in. Affected actions include adding MCP servers, viewing metrics, and creating servers.


🧩 Affected Component

Select the area of the project impacted:

  • [x] mcpgateway - API
  • [x] mcpgateway - UI (admin panel)
  • [ ] mcpgateway.wrapper - stdio wrapper
  • [ ] Federation or Transports
  • [ ] CLI, Makefiles, or shell scripts
  • [ ] Container setup (Docker/Podman/Compose)
  • [ ] Other (explain below)

🔁 Steps to Reproduce

  1. Deploy ghcr.io/ibm/mcp-context-forge:latest on Kubernetes with UI and Admin API enabled and auth required (env excerpt below). DB is SQLite on a PVC at /data.
  2. Power off the host (or shut down the cluster) at end of day; power it back on the next day. (A cold start of the pod alone may also reproduce the issue.)
  3. Log in to the Admin UI (Basic Auth).
  4. Try any privileged action: Add MCP server, Metrics tab, Create server, etc.
  5. The UI shows “401 Unauthorized” responses for those API calls while the UI still indicates I’m logged in.
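
To take the browser out of the picture, the same admin routes can be probed directly. The sketch below is only a starting point: the base URL is an assumption (port 4444 taken from the Service description in this report), and whether the admin routes accept a Basic Auth header directly, rather than a cookie set by the UI, is also an assumption to verify.

```python
import base64
import urllib.error
import urllib.request

# Assumption: the gateway is reachable here, e.g. via `kubectl port-forward`.
BASE_URL = "http://localhost:4444"
# Routes named in this report as returning 401.
ADMIN_PATHS = ["/admin/servers", "/admin/metrics"]


def basic_auth_header(user: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (RFC 7617)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def probe(base_url: str, path: str, headers: dict, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, including 4xx/5xx codes."""
    request = urllib.request.Request(base_url + path, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what matters here.
        return err.code
```

Calling `probe(BASE_URL, path, basic_auth_header(user, password))` for each entry in `ADMIN_PATHS` right after a cold restart would show whether the 401s also occur without any browser cookie involved. If header-based requests succeed while the UI fails, that points toward the cookie handling rather than the credentials.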

🤔 Expected Behavior

Admin actions should succeed when authenticated (200/201 responses), without requiring any extra steps after a cold restart.


📓 Logs / Error Output

Network panel shows 401 on endpoints such as /admin/servers, /admin/metrics, and related admin routes.
Pod logs primarily show 401 responses for those requests (no stack trace).
⚠️ No secrets included. (Can provide additional sanitized logs if needed.)


🧠 Environment Info

You can retrieve most of this from the /version endpoint.

| Key | Value |
|---|---|
| Version or commit | ghcr.io/ibm/mcp-context-forge:latest (as of 2025-08-27) |
| Runtime | Containerized in Kubernetes (auth required; UI + Admin API enabled) |
| Platform / OS | Kubernetes cluster (Namespace mcp) |
| Container | Deployed via Deployment + PVC; Service is ClusterIP (HTTP to port 4444) |

🧩 Additional Context (optional)

Kubernetes manifest (relevant bits):

env:
  - { name: HOST, value: "0.0.0.0" }
  - { name: MCPGATEWAY_UI_ENABLED, value: "true" }
  - { name: MCPGATEWAY_ADMIN_API_ENABLED, value: "true" }
  - { name: AUTH_REQUIRED, value: "true" }
  - name: BASIC_AUTH_USER
    valueFrom: { secretKeyRef: { name: mcpgateway-secret, key: BASIC_AUTH_USER } }
  - name: BASIC_AUTH_PASSWORD
    valueFrom: { secretKeyRef: { name: mcpgateway-secret, key: BASIC_AUTH_PASSWORD } }
  - name: JWT_SECRET_KEY
    valueFrom: { secretKeyRef: { name: mcpgateway-secret, key: JWT_SECRET_KEY } }
  - name: DATABASE_URL
    value: "sqlite:////data/gateway/mcp.db"

Notes / hypotheses to help triage:

  • If cookies are marked Secure and the UI is accessed over plain HTTP, the browser won’t send the cookie, which could present as 401s on admin routes after restart/session changes. Consider reproducing with HTTPS or, only for testing, SECURE_COOKIES=false.
  • Confirm whether admin auth relies on a cookie vs. header in the UI; check COOKIE_SAMESITE and related settings.
  • Verify that the JWT signing key (JWT_SECRET_KEY) and server time are stable across restarts (clock skew can invalidate tokens).
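
To illustrate the signing-key and clock hypotheses, here is a minimal standard-library sketch of HS256 token checking. It assumes the gateway issues HS256 JWTs signed with JWT_SECRET_KEY, which this report does not confirm; the function names and key strings are hypothetical and are not the gateway's code.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, key: str) -> str:
    """Create a compact HS256 JWT for the given claims."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(key.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def check_hs256(token: str, key: str, now: float) -> str:
    """Return 'ok', 'bad signature', or 'expired' for a token at time `now`."""
    header, payload, signature = token.split(".")
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(key.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        return "bad signature"
    claims = json.loads(_b64url_decode(payload))
    if "exp" in claims and now >= claims["exp"]:
        return "expired"
    return "ok"


# A token issued before the restart, valid for 600 seconds.
token = sign_hs256({"sub": "admin", "exp": 1_000_600}, "key-before-restart")
same_key = check_hs256(token, "key-before-restart", now=1_000_000)
changed_key = check_hs256(token, "key-after-restart", now=1_000_000)
skewed_clock = check_hs256(token, "key-before-restart", now=1_000_601)
```

With the same key and a sane clock the token is accepted; a different key after restart yields a signature failure, and a server clock that has jumped past `exp` yields an expiry. Either failure would plausibly surface as a 401 while the browser still holds what looks like a valid session.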

Potential directions:

  • Provide guidance on expected cookie settings for HTTP vs HTTPS deployments.
  • Clarify whether the UI refreshes/rotates tokens after pod restarts, and whether any cache needs to be cleared.
  • Document any known issues with SQLite on a PVC across restarts that could affect session storage, so that this cause can be ruled in or out.
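
As a possible aid for the cookie guidance, the following sketch checks whether a cookie from a captured Set-Cookie header would be sent back for a given page URL, based only on the Secure attribute. The cookie name `jwt_token` and the hostname are hypothetical, and the rule is simplified: real browsers also apply SameSite, Path, and Domain rules, and some treat http://localhost as trustworthy.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit


def cookie_will_be_sent(set_cookie_header: str, page_url: str) -> dict:
    """Map each cookie name to whether the Secure rule allows sending it.

    A cookie flagged Secure is only returned over HTTPS; cookies without
    the flag are returned over both HTTP and HTTPS.
    """
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    scheme = urlsplit(page_url).scheme
    result = {}
    for name, morsel in cookie.items():
        is_secure = bool(morsel["secure"])
        result[name] = (not is_secure) or scheme == "https"
    return result


# Hypothetical header value, as might be copied from the browser's network panel.
header = "jwt_token=abc; Path=/; HttpOnly; Secure; SameSite=Lax"
over_http = cookie_will_be_sent(header, "http://gateway.example:4444/admin")
over_https = cookie_will_be_sent(header, "https://gateway.example/admin")
```

Here `over_http` reports that the cookie would be withheld and `over_https` that it would be sent, which matches the symptom of 401s when a ClusterIP Service is reached over plain HTTP.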

InigoGastesi, Aug 27 '25 07:08