When connecting to an external postgres instance, the migrations did not succeed

Open BadAsstronaut opened this issue 3 years ago • 5 comments

Describe the bug

I am running Authentik on AWS Fargate, connecting to an RDS Aurora Serverless database. When starting Authentik for the first time, the Django migrations fail to complete. This causes the authentik server instance to exit and then retry.

To Reproduce

Steps to reproduce the behavior:

  • Create AWS RDS Aurora Serverless db cluster
  • Configure Authentik in Fargate and connect to the RDS instance
  • Check Authentik logs

Expected behavior

Migrations should complete without errors.

Logs

psycopg2.errors.ObjectInUse: cannot ALTER TABLE "authentik_stages_identification_identificationstage" because it has pending trigger events
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/lifecycle/migrate.py", line 91, in <module>
Running migrations:
Operations to perform:
Apply all migrations: auth, authentik_core, authentik_crypto, authentik_events, authentik_flows, authentik_outposts, authentik_policies, authentik_policies_dummy, authentik_policies_event_matcher, authentik_policies_expiry, authentik_policies_expression, authentik_policies_hibp, authentik_policies_password, authentik_policies_reputation, authentik_providers_ldap, authentik_providers_oauth2, authentik_providers_proxy, authentik_providers_saml, authentik_sources_ldap, authentik_sources_oauth, authentik_sources_plex, authentik_sources_saml, authentik_stages_authenticator_duo, authentik_stages_authenticator_sms, authentik_stages_authenticator_static, authentik_stages_authenticator_totp, authentik_stages_authenticator_validate, authentik_stages_authenticator_webauthn, authentik_stages_captcha, authentik_stages_consent, authentik_stages_deny, authentik_stages_dummy, authentik_stages_email, authentik_stages_identification, authentik_stages_invitation, authentik_stages_password, authentik_stages_prompt, authentik_stages_user_delete, authentik_stages_user_login, authentik_stages_user_logout, authentik_stages_user_write, authentik_tenants, contenttypes, guardian, otp_static, otp_totp, sessions

The authentik_stages_identification_identificationstage table contained a record for the default login flow. I was able to manually delete the record, allow the startup migrations to complete, and then re-create the deleted record. This resolved the issue as a temp workaround.

Version and Deployment (please complete the following information):

  • authentik version: 2022.3.3
  • Deployment: AWS Fargate; redis and worker on-task with server; PG hosted in RDS Aurora Serverless

Additional context

Sample AWS CloudFormation snippet:

  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: !Sub '${ServiceName}TaskDefinition'
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      RuntimePlatform:
        CpuArchitecture: ARM64
      Cpu: 1024
      Memory: 3GB
      ExecutionRoleArn: !GetAtt ExecutionRole.Arn
      TaskRoleArn: !Ref TaskRole
      ContainerDefinitions:
        - Name: !Ref ServiceName
          Image: !Ref ServiceImage
          Command:
            - server
          PortMappings:
            - ContainerPort: !Ref ContainerPort
          Environment:
            - Name: AUTHENTIK_REDIS__HOST
              Value: localhost
            - Name: AUTHENTIK_SECRET_KEY
              Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString}}'
            - Name: AUTHENTIK_POSTGRESQL__USER
              Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:username}}'
            - Name: AUTHENTIK_POSTGRESQL__PASSWORD
              Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:password}}'
            - Name: AUTHENTIK_POSTGRESQL__HOST
              Value: !ImportValue 'dev-shared-internal:database-host'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref LogGroup
              awslogs-stream-prefix: ecs
        - Name: !Sub ${ServiceName}-worker
          Image: !Ref ServiceImage
          Command:
            - worker
          Environment:
            - Name: AUTHENTIK_REDIS__HOST
              Value: localhost
            - Name: AUTHENTIK_SECRET_KEY
              Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString}}'
            - Name: AUTHENTIK_POSTGRESQL__USER
              Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:username}}'
            - Name: AUTHENTIK_POSTGRESQL__PASSWORD
              Value: !Sub '{{resolve:secretsmanager:${AuthentikSecret}:SecretString:password}}'
            - Name: AUTHENTIK_POSTGRESQL__HOST
              Value: !ImportValue 'dev-shared-internal:database-host'
            - Name: AUTHENTIK_POSTGRESQL__NAME
              Value: authentik
            - Name: AUTHENTIK_OUTPOSTS__DISCOVER
              Value: 'false'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref LogGroup
              awslogs-stream-prefix: ecs
        - Name: Redis
          Image: !Ref RedisImage
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref LogGroup
              awslogs-stream-prefix: ecs

BadAsstronaut avatar Mar 22 '22 19:03 BadAsstronaut

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar May 21 '22 20:05 stale[bot]

I have this issue as well, but with an external (though still local) postgres DB. No functionality seems to be impacted, but the error spam is a bit much. I can attach more information if this issue gets any attention.

I will try to find a similar workaround though.

h3mmy avatar May 23 '22 14:05 h3mmy

I haven't been able to reproduce this (tested on RDS Postgres 14.something, with the rest running in compose on an EC2 box).

However I did notice that the error comes from the pre-django migrations, i.e. https://github.com/goauthentik/authentik/blob/master/lifecycle/system_migrations

In this case it looks like https://github.com/goauthentik/authentik/blob/master/lifecycle/system_migrations/to_0_13_authentik.py is the culprit, and it runs because it still sees a passbook_core_user table, which is from before the rebrand.

If your instance is working fine you can just go into the database to delete that table, or do a cold migration (stop all existing instances, let the migrations run and then start them up again)
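Before deleting anything, it may be worth confirming that the leftover table is actually present. The following is a minimal sketch, not part of authentik: it assumes a DB-API cursor (for example from psycopg2) already connected to the authentik database, and both the helper name `has_legacy_table` and the `public` schema are assumptions made for illustration.

```python
def has_legacy_table(cursor, table="passbook_core_user", schema="public"):
    """Return True if the pre-rebrand table still exists.

    `cursor` is any DB-API cursor using the `%s` parameter style
    (psycopg2 does). The schema name is an assumption; adjust it if
    your tables live elsewhere.
    """
    cursor.execute(
        "SELECT 1 FROM information_schema.tables "
        "WHERE table_schema = %s AND table_name = %s",
        (schema, table),
    )
    # A row comes back only when the table is present.
    return cursor.fetchone() is not None
```

If this returns False, the system migration mentioned above should have no reason to run, and the cause is probably elsewhere.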

BeryJu avatar May 23 '22 17:05 BeryJu

I have the exact same problem. I use the ~2022.5.3~ 2022.6.2 docker image with postgresql 10 (on host) on a rocky linux 8.6 host.

When I use a newer version of postgres (tested with the provided docker-compose.yml with a postgres11 and postgres12) I don't run into this issue.

EDIT:

The error also occurs if you use a postgres 10 image in docker, so you should be able to reproduce the error with this compose file:

---
version: '3.4'

services:
  postgresql:
    image: postgres:10-alpine
    restart: unless-stopped
    volumes:
      - ./database:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=authentik
      - POSTGRES_USER=authentik
      - POSTGRES_DB=authentik
  redis:
    image: redis:alpine
    restart: unless-stopped
  server:
    image: ${AUTHENTIK_IMAGE:-ghcr.io/goauthentik/server}:${AUTHENTIK_TAG:-2022.6.2}
    restart: unless-stopped
    command: server
    environment:
      AUTHENTIK_REDIS__HOST: redis
      AUTHENTIK_POSTGRESQL__HOST: postgresql
      AUTHENTIK_POSTGRESQL__USER: authentik
      AUTHENTIK_POSTGRESQL__NAME: authentik
      AUTHENTIK_POSTGRESQL__PASSWORD: authentik
      AUTHENTIK_SECRET_KEY: authentik
    env_file:
      - .env
    ports:
      - "0.0.0.0:${AUTHENTIK_PORT_HTTP:-9000}:9000"
      - "0.0.0.0:${AUTHENTIK_PORT_HTTPS:-9443}:9443"

Also, here is the error from the POV of the postgres server:

test-postgresql-1  | 2022-06-09 20:58:00.717 UTC [61] ERROR:  cannot ALTER TABLE "authentik_stages_identification_identificationstage" because it has pending trigger events
test-postgresql-1  | 2022-06-09 20:58:00.717 UTC [61] STATEMENT:  ALTER TABLE "authentik_stages_identification_identificationstage" ADD COLUMN "password_stage_id" uuid NULL CONSTRAINT "authentik_stages_ide_password_stage_id_8d68497a_fk_authentik" REFERENCES "authentik_stages_password_passwordstage"("stage_ptr_id") DEFERRABLE INITIALLY DEFERRED; SET CONSTRAINTS "authentik_stages_ide_password_stage_id_8d68497a_fk_authentik" IMMEDIATE

@BadAsstronaut do you also use a postgres 10 server?
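To answer that question for a given server, the major version can be derived from PostgreSQL's numeric version (the value of `SHOW server_version_num`, which psycopg2 also exposes as `connection.server_version`). This is a rough sketch; the function names are hypothetical, and the "affected" check encodes only what was observed in this thread (10 fails, 11 and 12 work), not a confirmed version boundary.

```python
def postgres_major(server_version_num: int) -> int:
    """Major version from the numeric form, e.g. 100021 -> 10, 140005 -> 14."""
    return server_version_num // 10000


def matches_reported_failure(server_version_num: int) -> bool:
    """True for Postgres 10, the only version reported as failing here.

    Versions 11 and 12 were reported as working; older versions were not
    tested in this thread, so they are deliberately not flagged.
    """
    return postgres_major(server_version_num) == 10
```

`SHOW server_version_num` returns text, so wrap the result in `int(...)` before passing it in.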

dermalikmann avatar Jun 09 '22 20:06 dermalikmann

To close the loop on my end, I fixed it by just doing a rolling update on the namespace authentik is in. It's the only thing in its namespace atm, so that's just all authentik components + the external postgres instance. I had tracked it to a Django issue, but forgot which one.

h3mmy avatar Jun 11 '22 22:06 h3mmy

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Aug 10 '22 23:08 stale[bot]