authentik
When connecting to an external postgres instance, the migrations did not succeed
Describe the bug
I am running Authentik on AWS Fargate, connecting to an RDS Serverless Aurora database. When starting Authentik for the first time, the Django migrations fail to complete. This causes the authentik server instance to exit and then retry.
To Reproduce
Steps to reproduce the behavior:
- Create AWS RDS Aurora Serverless db cluster
- Configure Authentik in Fargate and connect to the RDS instance
- Check Authentik logs
Expected behavior
Migrations should complete without errors.
Logs
psycopg2.errors.ObjectInUse: cannot ALTER TABLE "authentik_stages_identification_identificationstage" because it has pending trigger events
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/lifecycle/migrate.py", line 91, in <module>
Running migrations:
Operations to perform:
Apply all migrations: auth, authentik_core, authentik_crypto, authentik_events, authentik_flows, authentik_outposts, authentik_policies, authentik_policies_dummy, authentik_policies_event_matcher, authentik_policies_expiry, authentik_policies_expression, authentik_policies_hibp, authentik_policies_password, authentik_policies_reputation, authentik_providers_ldap, authentik_providers_oauth2, authentik_providers_proxy, authentik_providers_saml, authentik_sources_ldap, authentik_sources_oauth, authentik_sources_plex, authentik_sources_saml, authentik_stages_authenticator_duo, authentik_stages_authenticator_sms, authentik_stages_authenticator_static, authentik_stages_authenticator_totp, authentik_stages_authenticator_validate, authentik_stages_authenticator_webauthn, authentik_stages_captcha, authentik_stages_consent, authentik_stages_deny, authentik_stages_dummy, authentik_stages_email, authentik_stages_identification, authentik_stages_invitation, authentik_stages_password, authentik_stages_prompt, authentik_stages_user_delete, authentik_stages_user_login, authentik_stages_user_logout, authentik_stages_user_write, authentik_tenants, contenttypes, guardian, otp_static, otp_totp, sessions
The authentik_stages_identification_identificationstage table contained a record for the default login flow. I was able to manually delete the record, allow the startup migrations to complete, and then re-create the deleted record. This resolved the issue as a temp workaround.
Version and Deployment (please complete the following information):
- authentik version: 2022.3.3
- Deployment: AWS Fargate; redis and worker on-task with server; PG hosted in RDS Aurora Serverless
Additional context
Sample AWS CloudFormation snippet:
TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: !Sub '${ServiceName}TaskDefinition'
    NetworkMode: awsvpc
    RequiresCompatibilities:
      - FARGATE
    RuntimePlatform:
      CpuArchitecture: ARM64
    Cpu: 1024
    Memory: 3GB
    ExecutionRoleArn: !GetAtt ExecutionRole.Arn
    TaskRoleArn: !Ref TaskRole
    ContainerDefinitions:
      - Name: !Ref ServiceName
        Image: !Ref ServiceImage
        Command:
          - server
        PortMappings:
          - ContainerPort: !Ref ContainerPort
        Environment:
          - Name: AUTHENTIK_REDIS__HOST
            Value: localhost
          - Name: AUTHENTIK_SECRET_KEY
            Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString}}'
          - Name: AUTHENTIK_POSTGRESQL__USER
            Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:username}}'
          - Name: AUTHENTIK_POSTGRESQL__PASSWORD
            Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:password}}'
          - Name: AUTHENTIK_POSTGRESQL__HOST
            Value: !ImportValue 'dev-shared-internal:database-host'
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-region: !Ref AWS::Region
            awslogs-group: !Ref LogGroup
            awslogs-stream-prefix: ecs
      - Name: !Sub ${ServiceName}-worker
        Image: !Ref ServiceImage
        Command:
          - worker
        Environment:
          - Name: AUTHENTIK_REDIS__HOST
            Value: localhost
          - Name: AUTHENTIK_SECRET_KEY
            Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString}}'
          - Name: AUTHENTIK_POSTGRESQL__USER
            Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:username}}'
          - Name: AUTHENTIK_POSTGRESQL__PASSWORD
            Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:password}}'
          - Name: AUTHENTIK_POSTGRESQL__HOST
            Value: !ImportValue 'dev-shared-internal:database-host'
          - Name: AUTHENTIK_POSTGRESQL__NAME
            Value: authentik
          - Name: AUTHENTIK_OUTPOSTS__DISCOVER
            Value: 'false'
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-region: !Ref AWS::Region
            awslogs-group: !Ref LogGroup
            awslogs-stream-prefix: ecs
      - Name: Redis
        Image: !Ref RedisImage
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-region: !Ref AWS::Region
            awslogs-group: !Ref LogGroup
            awslogs-stream-prefix: ecs
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I have this issue as well, but with an external (though still local) Postgres DB. No functionality seems to be impacted, but the error spam is a bit much. I can attach more information if this issue gets any attention.
I will try to find a similar workaround though.
I haven't been able to reproduce this (tested on RDS Postgres 14.something, with the rest running in compose on an EC2 box).
However I did notice that the error comes from the pre-django migrations, i.e. https://github.com/goauthentik/authentik/blob/master/lifecycle/system_migrations
In this case it looks like https://github.com/goauthentik/authentik/blob/master/lifecycle/system_migrations/to_0_13_authentik.py is the culprit, and it runs because it still sees a passbook_core_user table, which is from before the rebrand.
If your instance is working fine you can just go into the database to delete that table, or do a cold migration (stop all existing instances, let the migrations run and then start them up again)
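For anyone who wants to check whether that old table is actually present before deleting anything, here is a minimal sketch. It assumes a psycopg2-style DB-API cursor (the `%s` placeholder style) and the table name mentioned above; the helper name `legacy_table_exists` is made up for illustration, and this is not necessarily how the system migration itself performs the check.

```python
# Table name taken from the comment above; everything else is illustrative.
LEGACY_TABLE = "passbook_core_user"

# information_schema.tables is a standard PostgreSQL catalog view.
CHECK_SQL = "SELECT 1 FROM information_schema.tables WHERE table_name = %s"


def legacy_table_exists(cursor, table=LEGACY_TABLE):
    """Return True if a table with the given name exists in the database.

    `cursor` is assumed to be a DB-API cursor that uses %s placeholders,
    such as the ones psycopg2 provides.
    """
    cursor.execute(CHECK_SQL, (table,))
    return cursor.fetchone() is not None
```

With a real connection this would be called as something like `legacy_table_exists(conn.cursor())`. If it returns False, the leftover table is probably not what triggers the system migration on that instance.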
I have the exact same problem. I use the ~2022.5.3~ 2022.6.2 docker image with PostgreSQL 10 (on the host) on a Rocky Linux 8.6 host.
When I use a newer version of postgres (tested with the provided docker-compose.yml with a postgres11 and postgres12) I don't run into this issue.
EDIT:
The error also occurs if you use a postgres 10 image in docker, so you should be able to reproduce it with:
---
version: '3.4'
services:
  postgresql:
    image: postgres:10-alpine
    restart: unless-stopped
    volumes:
      - ./database:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=authentik
      - POSTGRES_USER=authentik
      - POSTGRES_DB=authentik
  redis:
    image: redis:alpine
    restart: unless-stopped
  server:
    image: ${AUTHENTIK_IMAGE:-ghcr.io/goauthentik/server}:${AUTHENTIK_TAG:-2022.6.2}
    restart: unless-stopped
    command: server
    environment:
      AUTHENTIK_REDIS__HOST: redis
      AUTHENTIK_POSTGRESQL__HOST: postgresql
      AUTHENTIK_POSTGRESQL__USER: authentik
      AUTHENTIK_POSTGRESQL__NAME: authentik
      AUTHENTIK_POSTGRESQL__PASSWORD: authentik
      AUTHENTIK_SECRET_KEY: authentik
    env_file:
      - .env
    ports:
      - "0.0.0.0:${AUTHENTIK_PORT_HTTP:-9000}:9000"
      - "0.0.0.0:${AUTHENTIK_PORT_HTTPS:-9443}:9443"
Also, here is the error from the POV of the postgres server:
test-postgresql-1 | 2022-06-09 20:58:00.717 UTC [61] ERROR: cannot ALTER TABLE "authentik_stages_identification_identificationstage" because it has pending trigger events
test-postgresql-1 | 2022-06-09 20:58:00.717 UTC [61] STATEMENT: ALTER TABLE "authentik_stages_identification_identificationstage" ADD COLUMN "password_stage_id" uuid NULL CONSTRAINT "authentik_stages_ide_password_stage_id_8d68497a_fk_authentik" REFERENCES "authentik_stages_password_passwordstage"("stage_ptr_id") DEFERRABLE INITIALLY DEFERRED; SET CONSTRAINTS "authentik_stages_ide_password_stage_id_8d68497a_fk_authentik" IMMEDIATE
@BadAsstronaut do you also use a postgres 10 server?
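If it helps others answer the same question, a rough sketch of a version check follows. It relies only on what has been reported in this thread (failures on Postgres 10; no failures on 11, 12, or 14), so the range is an observation and not a confirmed root cause. The function name is made up, and the input is PostgreSQL's integer `server_version_num`, which psycopg2 exposes as `connection.server_version`.

```python
def matches_reported_failing_version(server_version_num: int) -> bool:
    """Return True if the server is a PostgreSQL 10.x release.

    From PostgreSQL 10 onwards, server_version_num is major * 10000 + minor,
    so 10.21 is 100021 and 11.16 is 110016. The 10.x range is only what this
    thread reports as failing; it is an assumption, not a verified boundary.
    """
    return 100000 <= server_version_num < 110000
```

For example, with an open psycopg2 connection `conn`, `matches_reported_failing_version(conn.server_version)` would indicate whether the instance falls in the range where the error has been seen here.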
To close the loop on my end: I fixed it by doing a rolling update on the namespace authentik is in. It's the only thing in its namespace at the moment, so that was all authentik components plus the external postgres instance. I had tracked it to a Django issue, but forgot which one.