Hatchet Configuration Resets & Invalid Broadcast Address After Upgrading from v0.54.8 to v1 on K8s

Issue Description
After upgrading from v0.54.8 to v1 on Kubernetes (via helm upgrade), several configuration values were reset or changed unexpectedly:

- Default token: cleared, requiring reconfiguration.
- SERVER_URL in hatchet-shared-config: reset to the default instead of preserving the prior value.
- SERVER_AUTH_COOKIE_DOMAIN: reset, impacting authentication flows.
- SERVER_GRPC_BROADCAST_ADDRESS: changed from the expected service:7070 to localhost:7070, causing gRPC communication issues.

Because of these resets, authentication via both username/password and Google SSO was temporarily disrupted. After manually restoring each configuration variable to its intended value, the system now works correctly and I can see past workflow runs as expected.

Steps to Reproduce

1. Run helm upgrade from v0.54.8 to v1 on Kubernetes.
2. Check hatchet-shared-config and observe that SERVER_URL, SERVER_AUTH_COOKIE_DOMAIN, and other variables are reset.
3. Notice that the authentication token is missing and SERVER_GRPC_BROADCAST_ADDRESS is set to localhost:7070.
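
To make step 2 less manual, the values can be compared before and after the upgrade. The following is a minimal sketch, not part of Hatchet: it assumes hatchet-shared-config is a ConfigMap that was exported to JSON before and after the upgrade (for example with `kubectl get configmap hatchet-shared-config -o json`), and the helper names `load_data` and `changed_keys` are hypothetical. The token is left out because this report does not say which key holds it.

```python
import json

# Keys this report names as reset. The token is omitted on purpose:
# the report does not say which key stores it.
WATCHED_KEYS = (
    "SERVER_URL",
    "SERVER_AUTH_COOKIE_DOMAIN",
    "SERVER_GRPC_BROADCAST_ADDRESS",
)


def load_data(path):
    """Return the `data` mapping from a ConfigMap exported as JSON."""
    with open(path) as fh:
        return json.load(fh).get("data") or {}


def changed_keys(before, after, keys=WATCHED_KEYS):
    """Map each watched key whose value differs to a (before, after) pair."""
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }
```

Calling `changed_keys(load_data("before.json"), load_data("after.json"))` would then list only the watched keys that the upgrade altered; the two file names are placeholders.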

Expected Behavior

Existing configuration values should be preserved across upgrades.
SERVER_GRPC_BROADCAST_ADDRESS should remain pointed to the actual service address, e.g., service:7070.
Authentication tokens and cookie domains should be kept intact without manual intervention.
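
A post-upgrade check for the broadcast address expectation could look like the sketch below. It is an illustration only: `is_loopback_address` is a hypothetical helper, it expects a `host:port` string, and it treats only the three common loopback spellings as invalid.

```python
# Host spellings that point back at the local machine.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_loopback_address(address):
    """Return True if a host:port string names a loopback host."""
    host, sep, _port = address.rpartition(":")
    if not sep:
        # No port given; treat the whole string as the host.
        host = address
    # Bracketed IPv6 literals such as [::1]:7070 carry brackets around the host.
    host = host.strip("[]")
    return host.lower() in LOOPBACK_HOSTS
```

With this check, `localhost:7070` is flagged while `service:7070` passes.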

Actual Behavior

Several critical config values (token, SERVER_URL, SERVER_AUTH_COOKIE_DOMAIN, and SERVER_GRPC_BROADCAST_ADDRESS) defaulted to incorrect or blank settings.

Workaround

Manually update the Helm chart (or directly update hatchet-shared-config) with the correct values:
- Restore the default token.
- Set SERVER_URL properly.
- Set SERVER_AUTH_COOKIE_DOMAIN to the intended domain.
- Change SERVER_GRPC_BROADCAST_ADDRESS back to service:7070.
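
For the direct-update route, the restore can be expressed as a JSON merge patch that touches only the drifted keys. This is a sketch under assumptions: it presumes hatchet-shared-config is a ConfigMap, `build_restore_patch` is a hypothetical helper, and the intended values must come from your own pre-upgrade configuration.

```python
import json


def build_restore_patch(current, intended):
    """Return a merge-patch body covering only keys that differ from intended.

    Returns an empty dict when nothing needs restoring.
    """
    drift = {
        key: value
        for key, value in intended.items()
        if current.get(key) != value
    }
    return {"data": drift} if drift else {}


def patch_as_json(current, intended):
    """Serialize the patch so it can be passed to a patch command."""
    return json.dumps(build_restore_patch(current, intended), sort_keys=True)
```

The resulting string could be supplied to `kubectl patch configmap hatchet-shared-config --type merge -p '<patch>'`. Setting the same values through the Helm values is likely the more durable option, since a later upgrade may rewrite resources the chart manages.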

Environment

Helm Chart Version: Upgraded from 0.54.8 to v1
Kubernetes Version: (please specify)
Authentication: Username/Password & Google SSO

Additional Context

After restoring the environment variables to their intended values, everything is functioning correctly, including viewing past workflow runs. The issue seems specifically tied to default configurations overriding previous values during the upgrade process.

Please investigate whether the Helm chart upgrade path or config migration scripts might be resetting these variables unintentionally.

Mar 21 '25 14:03 nmetaintro

For posterity, other config-loading breakage between 0.54.14 and 0.55.21 was discussed on Discord: https://discord.com/channels/1088927970518909068/1213612885499052073/1352723656282869971

Mar 21 '25 20:03 knksmith57

I wonder if https://github.com/hatchet-dev/hatchet/pull/1385 helped here.

Mar 31 '25 05:03 knksmith57

This issue has been stale for 30 days. Please update the issue or comment to keep it active. Otherwise, it will be closed in 5 days.

Sep 08 '25 08:09 github-actions[bot]