immich [BUG] Internal Server error 500 when searching

The bug

I just set up a new instance today, and when I search for anything I get a popup that says: Internal server error - 500 - Internal Server Error undefined

The search also doesn't return any photos, and if I go to Explore there are no recognized faces, etc.

Looking in the logs for the machine learning container, I'm getting some errors saying that the application is unable to start.

The OS that Immich Server is running on

Ubuntu 20.04.6 LTS

Version of Immich Server

v1.63.2

Version of Immich Mobile App

v1.63.0 build.103

Platform with the issue

[X] Server
[X] Web
[X] Mobile

Your docker-compose.yml content

version: "3.8"

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
    env_file:
      - .env
    depends_on:
      - redis
      - database
      - typesense
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
    env_file:
      - .env
    depends_on:
      - redis
      - database
      - typesense
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  immich-web:
    container_name: immich_web
    image: ghcr.io/immich-app/immich-web:${IMMICH_VERSION:-release}
    env_file:
      - .env
    restart: always

  typesense:
    container_name: immich_typesense
    image: typesense/typesense:0.24.1@sha256:9bcff2b829f12074426ca044b56160ca9d777a0c488303469143dd9f8259d4dd
    environment:
      - TYPESENSE_API_KEY=${TYPESENSE_API_KEY}
      - TYPESENSE_DATA_DIR=/data
    logging:
      driver: none
    volumes:
      - tsdata:/data
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:70a7a5b641117670beae0d80658430853896b5ef269ccf00d1827427e3263fa3
    restart: always

  database:
    container_name: immich_postgres
    image: postgres:14-alpine@sha256:28407a9961e76f2d285dc6991e8e48893503cc3836a4755bbc2d40bcc272a441
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      PG_DATA: /var/lib/postgresql/data
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

  immich-proxy:
    container_name: immich_proxy
    image: ghcr.io/immich-app/immich-proxy:${IMMICH_VERSION:-release}
    environment:
      # Make sure these values get passed through from the env file
      - IMMICH_SERVER_URL
      - IMMICH_WEB_URL
    ports:
      - 2283:8080
    depends_on:
      - immich-server
      - immich-web
    restart: always

volumes:
  pgdata:
  model-cache:
  tsdata:

Your .env content

###################################################################################
# Database
###################################################################################

# NOTE: The following four database variables support Docker secrets by adding a *_FILE suffix to the variable name
# See the docker-compose documentation on secrets for additional details: https://docs.docker.com/compose/compose-file/compose-file-v3/#secrets
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_PASSWORD=[redacted]
DB_DATABASE_NAME=immich

# Optional Database settings:
# DB_PORT=5432

###################################################################################
# Redis
###################################################################################

REDIS_HOSTNAME=immich_redis

# REDIS_URL will be used to pass custom options to ioredis.
# Example for Sentinel
# {"sentinels":[{"host":"redis-sentinel-node-0","port":26379},{"host":"redis-sentinel-node-1","port":26379},{"host":"redis-sentinel-node-2","port":26379}],"name":"redis-sentinel"}
# REDIS_URL=ioredis://eyJzZW50aW5lbHMiOlt7Imhvc3QiOiJyZWRpcy1zZW50aW5lbDEiLCJwb3J0IjoyNjM3OX0seyJob3N0IjoicmVkaXMtc2VudGluZWwyIiwicG9ydCI6MjYzNzl9XSwibmFtZSI6Im15bWFzdGVyIn0=

# Optional Redis settings:

# Note: these parameters are not automatically passed to the Redis Container
# to do so, please edit the docker-compose.yml file as well. Redis is not configured
# via environment variables, only redis.conf or the command line

# REDIS_PORT=6379
# REDIS_DBINDEX=0
# REDIS_USERNAME=
# REDIS_PASSWORD=
# REDIS_SOCKET=

###################################################################################
# Upload File Location
#
# This is the location where uploaded files are stored.
###################################################################################

UPLOAD_LOCATION=/home/michael/immich-uploads


###################################################################################
# Typesense
###################################################################################
TYPESENSE_API_KEY=[redacted]
# TYPESENSE_ENABLED=false
# TYPESENSE_URL uses base64 encoding for the nodes json.
# Example JSON that was used:
# [
#      { "host": "typesense-1.example.net", "port": "443", "protocol": "https" },
#      { "host": "typesense-2.example.net", "port": "443", "protocol": "https" },
#      { "host": "typesense-3.example.net", "port": "443", "protocol": "https" },
# ]
# TYPESENSE_URL=ha://WwogIHsgImhvc3QiOiAidHlwZXNlbnNlLTEuZXhhbXBsZS5uZXQiLCAicG9ydCI6ICI0NDMiLCAicHJvdG9jb2wiOiAiaHR0cHMiIH0sCiAgeyAiaG9zdCI6ICJ0eXBlc2Vuc2UtMi5leGFtcGxlLm5ldCIsICJwb3J0IjogIjQ0MyIsICJwcm90b2NvbCI6ICJodHRwcyIgfSwKICB7ICJob3N0IjogInR5cGVzZW5zZS0zLmV4YW1wbGUubmV0IiwgInBvcnQiOiAiNDQzIiwgInByb3RvY29sIjogImh0dHBzIiB9Cl0=

###################################################################################
# Reverse Geocoding
#
# Reverse geocoding is done locally which has a small impact on memory usage
# This memory usage can be altered by changing the REVERSE_GEOCODING_PRECISION variable
# This ranges from 0-3 with 3 being the most precise
# 3 - Cities > 500 population: ~200MB RAM
# 2 - Cities > 1000 population: ~150MB RAM
# 1 - Cities > 5000 population: ~80MB RAM
# 0 - Cities > 15000 population: ~40MB RAM
####################################################################################

# DISABLE_REVERSE_GEOCODING=false
# REVERSE_GEOCODING_PRECISION=3

####################################################################################
# WEB - Optional
#
# Custom message on the login page, should be written in HTML form.
# For example:
# PUBLIC_LOGIN_PAGE_MESSAGE="This is a demo instance of Immich.<br><br>Email: <i>[email protected]</i><br>Password: <i>demo</i>"
####################################################################################

PUBLIC_LOGIN_PAGE_MESSAGE=

####################################################################################
# Alternative Service Addresses - Optional
#
# This is an advanced feature for users who may be running their immich services on different hosts.
# It will not change which address or port that services bind to within their containers, but it will change where other services look for their peers.
# Note: immich-microservices is bound to 3002, but no references are made
####################################################################################

IMMICH_WEB_URL=http://immich-web:3000
IMMICH_SERVER_URL=http://immich-server:3001
IMMICH_MACHINE_LEARNING_URL=http://immich-machine-learning:3003

####################################################################################
# Alternative API's External Address - Optional
#
# This is an advanced feature used to control the public server endpoint returned to clients during Well-known discovery.
# You should only use this if you want mobile apps to access the immich API over a custom URL. Do not include trailing slash.
# NOTE: At this time, the web app will not be affected by this setting and will continue to use the relative path: /api
# Examples: http://localhost:3001, http://immich-api.example.com, etc
####################################################################################

#IMMICH_API_URL_EXTERNAL=http://localhost:3001

###################################################################################
# Immich Version - Optional
#
# This allows all immich docker images to be pinned to a specific version. By default,
# the version is "release" but could be a specific version, like "v1.59.0".
###################################################################################

IMMICH_VERSION=v1.63.2

Reproduction steps

1. Log in to the web interface
2. Use the search bar to search for any term
3. Sadly see the error 500 message

Additional information

No response

Jun 25 '23 17:06 d-kholin

Herp derp forgot to add the logs from the machine learning container.

INFO:     Waiting for application startup.
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.
/opt/venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:65: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'CPUExecutionProvider'
  warnings.warn(
ERROR:    Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/starlette/routing.py", line 677, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/opt/venv/lib/python3.11/site-packages/starlette/routing.py", line 566, in __aenter__
    await self._router.startup()
  File "/opt/venv/lib/python3.11/site-packages/starlette/routing.py", line 654, in startup
    await handler()
  File "/usr/src/app/main.py", line 39, in startup_event
    await _model_cache.get_cached_model(model_name, model_type)
  File "/usr/src/app/cache.py", line 54, in get_cached_model
    model = get_model(model_name, model_type, **model_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/models.py", line 35, in get_model
    model = _load_facial_recognition(
            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/app/models.py", line 108, in _load_facial_recognition
    model = FaceAnalysis(
            ^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/insightface/app/face_analysis.py", line 31, in __init__
    model = model_zoo.get_model(onnx_file, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/insightface/model_zoo/model_zoo.py", line 96, in get_model
    model = router.get_model(providers=providers, provider_options=provider_options)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/insightface/model_zoo/model_zoo.py", line 40, in get_model
    session = PickableInferenceSession(self.onnx_file, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/insightface/model_zoo/model_zoo.py", line 25, in __init__
    super().__init__(model_path, **kwargs)
  File "/opt/venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/opt/venv/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 424, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /cache/cpu/facial-recognition/buffalo_l/models/buffalo_l/1k3d68.onnx failed:Protobuf parsing failed.

ERROR:    Application startup failed. Exiting.

Jun 25 '23 17:06 d-kholin

To be fair, I haven't been able to get search to work either. The only logs I see are from the proxy:

2023/06/25 17:39:47 [error] 38#38: *1761 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.244.5.121, server: , request: "GET /search/__data.json?q=.arw&clip=true&x-sveltekit-invalidated=01 HTTP/1.1", upstream: "http://10.105.221.177:80/search/__data.json?q=.arw&clip=true&x-sveltekit-invalidated=01", host: "immich-unwind-k8s", referrer: "http://immich-unwind-k8s/photos"

Jun 25 '23 17:06 uhthomas

We're looking into this, not completely sure what's causing it. In the meantime, deleting your model cache volume should fix the issue.

Jun 25 '23 17:06 mertalev

> We're looking into this, not completely sure what's causing it. In the meantime, deleting your model cache volume should fix the issue.

Interesting, removing that volume did indeed resolve the issue. Good to know

Jun 25 '23 17:06 d-kholin

Sorry for the noob question, but how do I remove the model cache volume? I'm facing this error as well (version 1.69.0). I should probably note that I am using Immich on TrueNAS SCALE, so I can't just change the model cache volume location in the docker compose file.

Jul 25 '23 08:07 rickykresslein

Same noob question: how do I remove the model cache volume?

Aug 20 '23 08:08 muava12

@muava12 @rickykresslein With the containers stopped:

docker volume list (to verify the volume names)
docker volume rm immich_model-cache

Aug 28 '23 22:08 CyberCois

Just had the same problem; removing the model cache volume solved it for me as well.

Aug 29 '23 14:08 Julian-1-2-3-4-5

I have this issue, however I'm unable to remove the volume even after stopping the containers...

I'm getting an error saying the volume is currently in use.

Sep 21 '23 16:09 CtrlAltSudo

Removing the model cache volume did nothing for me

Oct 25 '23 19:10 Steccah

> I have this issue, however I'm unable to remove the volume even after stopping the containers...
>
> I'm getting an error saying the volume is currently in use.

The machine learning container actually needs to be removed entirely. If it is just stopped the volume will still be in use. Try removing that container first then removing the volume.
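
As a rough sketch of that order of operations, assuming the volume is named immich_model-cache (the name used earlier in this thread) and the container is immich_machine_learning (the container_name in the compose file above). The run helper, the DRY_RUN switch, and the reset_model_cache function are made up for illustration, so adjust the names to whatever your own setup reports:

```shell
#!/bin/sh
# Sketch only: free and remove the Immich model cache volume.
# Both names are assumptions; verify them with `docker volume ls` and `docker ps -a`.
VOLUME="${VOLUME:-immich_model-cache}"
CONTAINER="${CONTAINER:-immich_machine_learning}"

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

reset_model_cache() {
  # Show every container, running or stopped, that still references the volume.
  run docker ps -a --filter "volume=$VOLUME"
  # A stopped container still holds the volume, so remove it rather than just stopping it.
  run docker rm -f "$CONTAINER"
  # With no container attached, the volume can be removed.
  run docker volume rm "$VOLUME"
}
```

Calling the function with DRY_RUN=1 only prints the three commands, which is a cheap way to check the names first. After a real run, docker compose up -d should recreate the machine learning container with an empty cache volume.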

Oct 25 '23 19:10 jrasm91

> Removing the model cache volume did nothing for me

Sounds like a separate issue, please open a new issue.

Oct 25 '23 19:10 jrasm91

> Sorry for the noob question, but how do I remove the model cache volume? I'm facing this error as well (version 1.69.0). I should probably note that I am using Immich on TrueNAS SCALE, so I can't just change the model cache volume location in the docker compose file.

It's been a few days, but do you remember how you translated jakefrancois' commands to work on TrueNAS SCALE, @rickykresslein?

Jan 29 '24 18:01 makanimike

@makanimike Sorry, I don't remember what I ended up doing. I'm no longer using TrueNAS, so I'm able to manage Immich directly through Docker.

Jan 30 '24 10:01 rickykresslein