Configuration Resets & Invalid Broadcast Address After Upgrading from v0.54.8 to v1 on K8s
Issue Description
After upgrading from v0.54.8 to v1 on Kubernetes (via helm upgrade), several configuration values were reset or changed unexpectedly:
- Default Token Reset: The default token was cleared, requiring reconfiguration.
- SERVER_URL in hatchet-shared-config: Reset to the default instead of preserving the prior value.
- SERVER_AUTH_COOKIE_DOMAIN: Was reset, impacting authentication flows.
- SERVER_GRPC_BROADCAST_ADDRESS: Changed from the expected service:7070 to localhost:7070, causing gRPC communication issues.
Because of these resets, authentication via both username/password and Google SSO was temporarily disrupted. After manually restoring each configuration variable to its intended value, the system now works correctly and I can see past workflow runs as expected.
Steps to Reproduce
- Run Helm upgrade from v0.54.8 to v1 on Kubernetes.
- Check the hatchet-shared-config and observe that SERVER_URL, SERVER_AUTH_COOKIE_DOMAIN, and other variables are reset.
- Notice that the authentication token is missing and SERVER_GRPC_BROADCAST_ADDRESS is set to localhost:7070.
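One way to make the reset visible is to snapshot hatchet-shared-config before and after the upgrade and diff the two. The sketch below assumes each snapshot has already been exported as a flat key/value mapping; the helper name `changed_keys` and every value shown are illustrative placeholders, not the chart's real defaults.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical snapshots of hatchet-shared-config taken before and after
# `helm upgrade`. All values are made-up placeholders for illustration.
before: Dict[str, str] = {
    "SERVER_URL": "https://hatchet.example.com",
    "SERVER_AUTH_COOKIE_DOMAIN": "hatchet.example.com",
    "SERVER_GRPC_BROADCAST_ADDRESS": "service:7070",
}
after: Dict[str, str] = {
    "SERVER_URL": "http://localhost:8080",
    "SERVER_AUTH_COOKIE_DOMAIN": "",
    "SERVER_GRPC_BROADCAST_ADDRESS": "localhost:7070",
}


def changed_keys(
    old: Dict[str, str], new: Dict[str, str]
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (key, old_value, new_value) for every key whose value differs.

    Keys present in only one snapshot are reported with None on the other side.
    """
    keys = sorted(set(old) | set(new))
    return [(k, old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)]


if __name__ == "__main__":
    for key, old_value, new_value in changed_keys(before, after):
        print(f"{key}: {old_value!r} -> {new_value!r}")
```

Running the same comparison after restoring the values should report no differences.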
Expected Behavior
- Existing configuration values should be preserved across upgrades.
- SERVER_GRPC_BROADCAST_ADDRESS should remain pointed to the actual service address, e.g., service:7070.
- Authentication tokens and cookie domains should be kept intact without manual intervention.
Actual Behavior
- Several critical config values (token, SERVER_URL, SERVER_AUTH_COOKIE_DOMAIN, and SERVER_GRPC_BROADCAST_ADDRESS) defaulted to incorrect or blank settings.
Workaround
- Manually update the Helm chart (or directly update hatchet-shared-config) with the correct values:
  - Restore the default token.
  - Set SERVER_URL properly.
  - Set SERVER_AUTH_COOKIE_DOMAIN to the intended domain.
  - Change SERVER_GRPC_BROADCAST_ADDRESS back to service:7070.
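After restoring the values, a quick sanity check can catch the two failure modes described in this issue: blank settings and a broadcast address that points at loopback. This is a minimal sketch, assuming the config is available as a flat key/value mapping; `find_problems` and the example values are hypothetical and not part of Hatchet or its chart.

```python
from typing import Dict, List

# Hosts that would make the gRPC broadcast address unreachable from other pods.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}

# The variables this issue reports as reset during the upgrade.
REQUIRED_KEYS = [
    "SERVER_URL",
    "SERVER_AUTH_COOKIE_DOMAIN",
    "SERVER_GRPC_BROADCAST_ADDRESS",
]


def find_problems(config: Dict[str, str]) -> List[str]:
    """Return human-readable problems found in a config mapping."""
    problems: List[str] = []
    for key in REQUIRED_KEYS:
        if not config.get(key, "").strip():
            problems.append(f"{key} is missing or blank")

    address = config.get("SERVER_GRPC_BROADCAST_ADDRESS", "")
    # Split off the port, then drop the brackets around an IPv6 literal.
    host = address.rsplit(":", 1)[0].strip("[]").lower()
    if host in LOOPBACK_HOSTS:
        problems.append(
            f"SERVER_GRPC_BROADCAST_ADDRESS points at loopback ({address})"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical restored values; substitute your own deployment's settings.
    restored = {
        "SERVER_URL": "https://hatchet.example.com",
        "SERVER_AUTH_COOKIE_DOMAIN": "hatchet.example.com",
        "SERVER_GRPC_BROADCAST_ADDRESS": "service:7070",
    }
    for problem in find_problems(restored) or ["no problems found"]:
        print(problem)
```

The check only covers the variables named in this report; it does not verify the default token or that the service address actually resolves inside the cluster.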
Environment
- Helm Chart Version: Upgraded from 0.54.8 to v1
- Kubernetes Version: (please specify)
- Authentication: Username/Password & Google SSO
Additional Context
After resetting the environment variables, everything is functioning correctly, including viewing past workflow runs. The issue seems specifically tied to default configurations overriding previous values during the upgrade process.
Please investigate whether the Helm chart upgrade path or config migration scripts might be resetting these variables unintentionally.
For posterity, other config-loading breakage between 0.54.14 and 0.55.21 was discussed in Discord: https://discord.com/channels/1088927970518909068/1213612885499052073/1352723656282869971
I wonder if https://github.com/hatchet-dev/hatchet/pull/1385 helped here