Redis crashing while scanning external library on Kubernetes

Open Echelon101 opened this issue 5 months ago • 6 comments

The bug

Shortly after I start the initial scan of my external libraries, Redis restarts, and the Immich server restarts shortly afterwards.

The external library contains about 100k assets.

The OS that Immich Server is running on

Ubuntu 24.04 LTS

Version of Immich Server

v1.114.0

Version of Immich Mobile App

v1.114.0

Platform with the issue

  • [X] Server
  • [ ] Web
  • [ ] Mobile

Your docker-compose.yml content

Deployed via Helm. Here is my modified server.yaml:
{{- define "immich.server.hardcodedValues" -}}
global:
  nameOverride: server

env:
  {{ if .Values.immich.metrics.enabled }}
      IMMICH_METRICS: true
  {{ end }}
  {{- if .Values.immich.configuration }}
      IMMICH_CONFIG_FILE: /config/immich-config.yaml
  {{- end }}

{{- if .Values.immich.configuration }}
podAnnotations:
  checksum/config: {{ .Values.immich.configuration | toYaml | sha256sum }}
{{- end }}

service:
  main:
    enabled: true
    primary: true
    type: ClusterIP
    ports:
      http:
        enabled: true
        primary: true
        port: 3001
        protocol: HTTP
      metrics-api:
        enabled: {{ .Values.immich.metrics.enabled }}
        port: 8081
        protocol: HTTP
      metrics-ms:
        enabled: {{ .Values.immich.metrics.enabled }}
        port: 8082
        protocol: HTTP


serviceMonitor:
  main:
    enabled: {{ .Values.immich.metrics.enabled }}
    endpoints:
      - port: metrics-api
        scheme: http
      - port: metrics-ms
        scheme: http

probes:
  liveness: &probes
    enabled: true
    custom: true
    spec:
      httpGet:
        path: /api/server-info/ping
        port: http
      initialDelaySeconds: 0
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 3
  readiness: *probes
  startup:
    enabled: true
    custom: true
    spec:
      httpGet:
        path: /api/server-info/ping
        port: http
      initialDelaySeconds: 0
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 30

persistence:
{{- if .Values.immich.configuration }}
  config:
    enabled: true
    type: configMap
    name: {{ .Release.Name }}-immich-config
{{- end }}
  library:
    enabled: true
    mountPath: /usr/src/app/upload
    existingClaim: {{ .Values.immich.persistence.library.existingClaim }}
  family:
    enabled: true
    mountPath: /mnt/family
    existingClaim: photo-family
  storage:
    enabled: true
    mountPath: /mnt/storage
    existingClaim: photo-storage
  other:
    enabled: true
    mountPath: /mnt/other
    existingClaim: photo-other
{{- end }}
#  {{- range $.Values.immich.persistence.external }}
#  {{ .name }}
#    enabled: true
#    mountPath: {{ .mountPath }}
#    existingClaim: {{ .existingClaim }}
#  {{- end}}
#
{{ if .Values.server.enabled }}
{{- $ctx := deepCopy . -}}
{{- $_ := get .Values "server" | mergeOverwrite $ctx.Values -}}
{{- $_ = include "immich.server.hardcodedValues" . | fromYaml | merge $ctx.Values -}}
{{- include "bjw-s.common.loader.all" $ctx }}
{{ end }}
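
The probe settings in the template above bound how long the server container can stay unresponsive before the kubelet acts on it. A rough sketch of that arithmetic, assuming the usual Kubernetes behaviour that a container is restarted after `failureThreshold` consecutive failed probes spaced `periodSeconds` apart. The helper name `restart_budget_seconds` is made up for illustration, and the result is an approximation rather than exact kubelet timing:

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """Subset of a Kubernetes probe spec, using the values from server.yaml."""
    initial_delay_seconds: int
    period_seconds: int
    timeout_seconds: int
    failure_threshold: int


def restart_budget_seconds(probe: Probe) -> int:
    """Approximate time a container may keep failing a probe before the
    kubelet gives up on it: failureThreshold consecutive failures, one per
    period. Hypothetical helper; real timing also depends on scheduling."""
    return probe.initial_delay_seconds + probe.period_seconds * probe.failure_threshold


# Values copied from the liveness/readiness and startup probes above.
liveness = Probe(initial_delay_seconds=0, period_seconds=10, timeout_seconds=1, failure_threshold=3)
startup = Probe(initial_delay_seconds=0, period_seconds=10, timeout_seconds=1, failure_threshold=30)

print(restart_budget_seconds(liveness))  # 30
print(restart_budget_seconds(startup))   # 300
```

With these values, a server that fails to answer `/api/server-info/ping` within 1 second for roughly 30 seconds in a row would be restarted by the liveness probe. It may be worth checking whether the server restart is such a probe kill or a crash caused by the lost Redis connection.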

Your .env content

No special env set; the values are set by the Helm chart.

Reproduction steps

  1. Deploy Helm Chart
  2. Add External Library
  3. Start Library Refresh
  4. Wait for the failure ...

Relevant log output

Immich Server Log:
[Nest] 7  - 09/09/2024, 6:14:06 PM     LOG [Microservices:Bootstrap] Immich Microservices is running [v1.114.0] [PRODUCTION]
[Nest] 7  - 09/09/2024, 6:16:11 PM     LOG [Microservices:LibraryService] Refreshing library 5ee1f56a-bdab-450c-adff-d80d59bfb8bb
Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:218:20) {
  errno: -104,
  code: 'ECONNRESET',
  syscall: 'read'
}
[Nest] 7  - 09/09/2024, 6:16:34 PM   ERROR [Microservices:JobService] Unable to run job handler (library/library-refresh): AbortError: Command aborted due to connection close
[Nest] 7  - 09/09/2024, 6:16:34 PM   ERROR [Microservices:JobService] AbortError: Command aborted due to connection close
    at abortError (/usr/src/app/node_modules/ioredis/built/redis/event_handler.js:81:17)
    at abortIncompletePipelines (/usr/src/app/node_modules/ioredis/built/redis/event_handler.js:105:28)
    at Socket.<anonymous> (/usr/src/app/node_modules/ioredis/built/redis/event_handler.js:140:13)
    at Object.onceWrapper (node:events:634:26)
    at Socket.emit (node:events:519:28)
    at TCP.<anonymous> (node:net:339:12)
[Nest] 7  - 09/09/2024, 6:16:34 PM   ERROR [Microservices:JobService] Object:
{
  "id": "5ee1f56a-bdab-450c-adff-d80d59bfb8bb",
  "refreshModifiedFiles": false,
  "refreshAllFiles": false
}

missing 'error' handler on this Redis client
[previous line repeated 19 more times]
Error: connect EPERM 10.43.91.65:6379 - Local (0.0.0.0:0)
    at internalConnect (node:net:1094:16)
    at defaultTriggerAsyncIdScope (node:internal/async_hooks:464:18)
    at GetAddrInfoReqWrap.emitLookup [as callback] (node:net:1400:9)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:111:8) {
  errno: -1,
  code: 'EPERM',
  syscall: 'connect',
  address: '10.43.91.65',
  port: 6379
}
Error: connect EPERM 10.43.91.65:6379 - Local (0.0.0.0:0)
    at internalConnect (node:net:1094:16)
    at defaultTriggerAsyncIdScope (node:internal/async_hooks:464:18)
    at GetAddrInfoReqWrap.emitLookup [as callback] (node:net:1400:9)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:111:8)
    at GetAddrInfoReqWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
  errno: -1,
  code: 'EPERM',
  syscall: 'connect',
  address: '10.43.91.65',
  port: 6379
}
[the same EPERM error is logged 8 more times]

REDIS Log:
1:C 09 Sep 2024 18:16:35.446 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 09 Sep 2024 18:16:35.446 * Redis version=7.2.5, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 09 Sep 2024 18:16:35.446 * Configuration loaded
1:M 09 Sep 2024 18:16:35.447 * monotonic clock: POSIX clock_gettime
1:M 09 Sep 2024 18:16:35.449 * Running mode=standalone, port=6379.
1:M 09 Sep 2024 18:16:35.451 * Server initialized
1:M 09 Sep 2024 18:16:35.467 * Reading RDB base file on AOF loading...
1:M 09 Sep 2024 18:16:35.467 * Loading RDB produced by version 7.2.5
1:M 09 Sep 2024 18:16:35.467 * RDB age 272 seconds
1:M 09 Sep 2024 18:16:35.467 * RDB memory usage when created 0.86 Mb
1:M 09 Sep 2024 18:16:35.467 * RDB is base AOF
1:M 09 Sep 2024 18:16:35.467 * Done loading RDB, keys loaded: 0, keys expired: 0.
1:M 09 Sep 2024 18:16:35.467 * DB loaded from base file appendonly.aof.1.base.rdb: 0.005 seconds
1:M 09 Sep 2024 18:16:35.539 * DB loaded from incr file appendonly.aof.1.incr.aof: 0.071 seconds
1:M 09 Sep 2024 18:16:35.539 * DB loaded from append only file: 0.077 seconds
1:M 09 Sep 2024 18:16:35.539 * Opening AOF incr file appendonly.aof.1.incr.aof on server start
1:M 09 Sep 2024 18:16:35.539 * Ready to accept connections tcp
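
The Redis log above only shows the new process starting at 18:16:35; it does not say why the previous one ended. A minimal sketch for reading that from the pod status, assuming the JSON comes from `kubectl get pod <redis-pod> -o json`. The pod name, the sample data, and the helper `last_terminations` are placeholders for illustration and are not taken from this deployment:

```python
import json


def last_terminations(pod: dict) -> list[tuple[str, int, str, int]]:
    """Return (container, restartCount, reason, exitCode) for every container
    whose previous instance was terminated, read from status.containerStatuses."""
    results = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("lastState", {}).get("terminated")
        if terminated:
            results.append((
                status["name"],
                status.get("restartCount", 0),
                terminated.get("reason", "Unknown"),
                terminated.get("exitCode", -1),
            ))
    return results


# Made-up sample shaped like `kubectl get pod -o json` output; the values
# are illustrative only and do not come from the cluster in this report.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "redis",
        "restartCount": 1,
        "lastState": {
          "terminated": {"reason": "OOMKilled", "exitCode": 137}
        }
      }
    ]
  }
}
""")

for name, restarts, reason, exit_code in last_terminations(sample):
    print(f"{name}: restarts={restarts} reason={reason} exitCode={exit_code}")
```

A reason such as `OOMKilled` would point at the container's memory limit, while other reasons would point elsewhere. The sample does not imply which one applies here.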

Additional information

Helm chart with small modifications to mount the external libraries directly.

Echelon101 • Sep 09 '24 18:09