
Authentik crashing (after Redis timeout)

Open Arragon5xpwm opened this issue 1 year ago • 13 comments

Describe the bug: authentik panics and the container crashes when the proxy outpost's connection to Redis times out, instead of reconnecting.

To Reproduce Steps to reproduce the behavior:

Does not seem to be reproducible via the (admin) UI.

Expected behavior: authentik recovers the connection instead of crashing.

Logs

{"error":"dial tcp: lookup redis-authentik: i/o timeout","event":"failed to connect to redis","level":"panic","logger":"authentik.outpost.proxyv2.application","name":"Apprise FowardAuth","timestamp":"2024-04-27T13:27:28+02:00"}

panic: (*logrus.Entry) 0xc0002ac150

goroutine 135 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc0002ac0e0, 0x0, {0xc00060a0e0, 0x1a})
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:260 +0x491
github.com/sirupsen/logrus.(*Entry).Log(0xc0002ac0e0, 0x0, {0xc000b54340?, 0x5?, 0x2?})
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:304 +0x48
github.com/sirupsen/logrus.(*Entry).Panic(...)
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:342
goauthentik.io/internal/outpost/proxyv2/application.(*Application).getStore(_, {0x48, {0xc00019e108, 0x12}, 0xc000114130, {0xc0002ec030, 0x22}, 0xc000812120, 0xc000114150, 0xc000114160, ...}, ...)
	/go/src/goauthentik.io/internal/outpost/proxyv2/application/session.go:76 +0x7e8
goauthentik.io/internal/outpost/proxyv2/application.NewApplication({0x48, {0xc00019e108, 0x12}, 0xc000114130, {0xc0002ec030, 0x22}, 0xc000812120, 0xc000114150, 0xc000114160, {{0xc0002ec120, ...}, ...}, ...}, ...)
	/go/src/goauthentik.io/internal/outpost/proxyv2/application/application.go:140 +0xf4a
goauthentik.io/internal/outpost/proxyv2.(*ProxyServer).Refresh(0xc00016a210)
	/go/src/goauthentik.io/internal/outpost/proxyv2/refresh.go:37 +0x567
goauthentik.io/internal/outpost/ak.(*APIController).OnRefresh(0xc000233180)
	/go/src/goauthentik.io/internal/outpost/ak/api.go:178 +0x314
goauthentik.io/internal/outpost/ak.(*APIController).startIntervalUpdater(0xc000233180)
	/go/src/goauthentik.io/internal/outpost/ak/api_ws.go:189 +0x17b
goauthentik.io/internal/outpost/ak.(*APIController).StartBackgroundTasks.func3()
	/go/src/goauthentik.io/internal/outpost/ak/api.go:216 +0x5d
created by goauthentik.io/internal/outpost/ak.(*APIController).StartBackgroundTasks in goroutine 16
	/go/src/goauthentik.io/internal/outpost/ak/api.go:214 +0x38d

Version and Deployment (please complete the following information):

  • authentik version: 2024.4.1
  • Deployment: docker-compose

Additional context: authentik continues after a restart (restart policy: unless-stopped).

Arragon5xpwm avatar Apr 27 '24 11:04 Arragon5xpwm

I experienced similar errors yesterday, and they seemed to coincide with high load on either the node CPU or the network. Can't say for sure right now, but I feel fairly confident that I can reproduce it in my modest homelab by downloading multiple Linux ISOs.

maxim-mityutko avatar May 08 '24 19:05 maxim-mityutko
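As a hedged aside (not from the thread): the sustained write I/O that the ISO downloads above generate can be approximated on a Linux host with dd; the path and sizes below are arbitrary:

```shell
# Write 64 MiB of zeros, with an fsync at the end, to generate disk I/O pressure.
# Increase bs/count to match the load that triggers the timeout.
dd if=/dev/zero of=/tmp/io-stress.bin bs=1M count=64 conv=fsync 2>/dev/null
ls -l /tmp/io-stress.bin
```

While this runs, watch the outpost logs for the "failed to connect to redis" panic.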

I'm getting the same errors with authentik crashing every few hours even when idle. Did you manage to fix this?

rama31244 avatar May 15 '24 04:05 rama31244

Are you by any chance running authentik on unraid?

rama31244 avatar May 18 '24 23:05 rama31244

Are you by any chance running authentik on unraid?

Yes

Arragon5xpwm avatar May 19 '24 07:05 Arragon5xpwm

I'm betting it's an unraid issue then. I might try messing with some of the environment variables, and I'll let you know how I go.

rama31244 avatar May 19 '24 11:05 rama31244

I think I found my issue: the error seems to originate from the fact that I already had Redis installed from when I ran Authelia. I deleted the container and the appdata folder, started again from a fresh Redis container, and now everything works as expected. I had already switched from the Bitnami Redis container to the official one, so I'm not sure whether that helped too. Hope it works for you as well.

rama31244 avatar May 20 '24 20:05 rama31244

Actually, scratch that: it just crashed again with the same error. Please let me know if you find a solution.

rama31244 avatar May 22 '24 21:05 rama31244

I noticed that heavy I/O seems to cause authentik's connection to Redis to time out, and authentik can't recover from that. As a workaround I simply set restart-policy: unless-stopped.

Arragon5xpwm avatar May 22 '24 21:05 Arragon5xpwm
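For reference, the workaround above corresponds to a docker-compose fragment like the following (the service name and image are assumptions, not taken from the reporter's setup):

```yaml
services:
  authentik-proxy:           # hypothetical service name
    image: ghcr.io/goauthentik/proxy
    restart: unless-stopped  # restart the container automatically after a panic
```

This only masks the crash: the container is restarted after each panic rather than the outpost reconnecting on its own.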

Ok, thanks. I might wait to see if it crashes again and then try your workaround. I wonder why this problem isn't reported by more people.

rama31244 avatar May 22 '24 21:05 rama31244

I also face this issue:

{"error":"dial tcp: lookup authentik-redis-master: i/o timeout","event":"failed to connect to redis","level":"panic","logger":"authentik.outpost.proxyv2.application","name":"loki","timestamp":"2024-05-28T15:48:09Z"}
panic: (*logrus.Entry) 0xc000237340

goroutine 216 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc0002372d0, 0x0, {0xc000442960, 0x1a})
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:260 +0x491
github.com/sirupsen/logrus.(*Entry).Log(0xc0002372d0, 0x0, {0xc000c9a340?, 0x5?, 0x2?})
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:304 +0x48
github.com/sirupsen/logrus.(*Entry).Panic(...)
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:342
goauthentik.io/internal/outpost/proxyv2/application.(*Application).getStore(_, {0x2, {0xc00078e21c, 0x4}, 0xc00019aed0, {0xc0004420a0, 0x1a}, 0xc00078e280, 0xc00019aef0, 0xc00019af20, ...}, ...)
	/go/src/goauthentik.io/internal/outpost/proxyv2/application/session.go:76 +0x7e8
goauthentik.io/internal/outpost/proxyv2/application.NewApplication({0x2, {0xc00078e21c, 0x4}, 0xc00019aed0, {0xc0004420a0, 0x1a}, 0xc00078e280, 0xc00019aef0, 0xc00019af20, {{0xc000192810, ...}, ...}, ...}, ...)
	/go/src/goauthentik.io/internal/outpost/proxyv2/application/application.go:140 +0xf4a
goauthentik.io/internal/outpost/proxyv2.(*ProxyServer).Refresh(0xc000178160)
	/go/src/goauthentik.io/internal/outpost/proxyv2/refresh.go:37 +0x567
goauthentik.io/internal/outpost/ak.(*APIController).OnRefresh(0xc000681880)
	/go/src/goauthentik.io/internal/outpost/ak/api.go:178 +0x314
goauthentik.io/internal/outpost/ak.(*APIController).startIntervalUpdater(0xc000681880)
	/go/src/goauthentik.io/internal/outpost/ak/api_ws.go:189 +0x17b
goauthentik.io/internal/outpost/ak.(*APIController).StartBackgroundTasks.func3()
	/go/src/goauthentik.io/internal/outpost/ak/api.go:216 +0x5d
created by goauthentik.io/internal/outpost/ak.(*APIController).StartBackgroundTasks in goroutine 192
	/go/src/goauthentik.io/internal/outpost/ak/api.go:214 +0x38d

Redis being unreachable is a problem in itself, but I was hoping authentik would retry rather than crash straight away :/ The real problem is that the crash invalidates the downstream user sessions, disconnecting users from Grafana/...

fayak avatar May 28 '24 16:05 fayak

Are you also on unraid? Setting restart-policy: unless-stopped is working for me at the moment, but the container still crashes about once a day.

rama31244 avatar May 28 '24 20:05 rama31244

The problem still pops up once in a while, as @Arragon5xpwm mentioned this happens during heavy I/O. Authentik and Redis are deployed on Kubernetes.

maxim-mityutko avatar Jun 15 '24 21:06 maxim-mityutko

Hey, I'm seeing the same error, seemingly due to I/O when my uptime service pings applications behind an unauthenticated route that still passes through authentik. This is deployed on an RKE2 cluster via the provided Helm chart, with Redis installed from the chart as a dedicated instance for authentik.

mlhynfield avatar Aug 09 '24 20:08 mlhynfield

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.