
Search not working when using OpenVINO

Open fwmone opened this issue 11 months ago • 5 comments

The bug

When the immich-machine-learning openvino image is enabled, search no longer works and instead returns an HTTP 500 error. The immich-machine-learning log shows:

[03/29/24 09:48:06] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-03-29 09:48:09.988640928 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[03/29/24 09:48:09] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:146 in _load        │
                         │                                                 │
                         │   143 │   │   super().__init__(clean_name(model │
                         │   144 │                                         │
                         │   145 │   def _load(self) -> None:              │
                         │ ❱ 146 │   │   super()._load()                   │
                         │   147 │   │   self._load_tokenizer()            │
                         │   148 │   │                                     │
                         │   149 │   │   size: list[int] | int = self.prep │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:36 in _load         │
                         │                                                 │
                         │    33 │   def _load(self) -> None:              │
                         │    34 │   │   if self.mode == "text" or self.mo │
                         │    35 │   │   │   log.debug(f"Loading clip text │
                         │ ❱  36 │   │   │   self.text_model = self._make_ │
                         │    37 │   │   │   log.debug(f"Loaded clip text  │
                         │    38 │   │                                     │
                         │    39 │   │   if self.mode == "vision" or self. │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :
                         RUNTIME_EXCEPTION : Encountered unknown exception
                         in Initialize()

Search works well without openvino enabled.
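
For reference, the working CPU-only setup is just the plain machine-learning image with the hwaccel extends section removed. A minimal sketch of that service, assuming the stock compose layout:

services:
  immich-machine-learning:
    # Plain (CPU) image: no -openvino suffix and no extends on hwaccel.ml.yml
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always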

My hwaccel.ml.yml:

version: "3.8"

# Configurations for hardware-accelerated machine learning
# If using Unraid or another platform that doesn't allow multiple Compose files,
# you can inline the config for a backend by copying its contents
# into the immich-machine-learning service in the docker-compose.yml file.
# See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.

services:
  armnn:
    devices:
      - /dev/mali0:/dev/mali0
    volumes:
      - /lib/firmware/mali_csffw.bin:/lib/firmware/mali_csffw.bin:ro # Mali firmware for your chipset (not always required depending on the driver)
      - /usr/lib/libmali.so:/usr/lib/libmali.so:ro # Mali driver for your chipset (always required)

  cpu:

  cuda:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities:
                - gpu
                - compute
                - video

  openvino:
    device_cgroup_rules:
      - "c 189:* rmw"
    devices:
      - /dev/dri:/dev/dri
    volumes:
      - /dev/bus/usb:/dev/bus/usb

  openvino-wsl:
    devices:
      - /dev/dri:/dev/dri
      - /dev/dxg:/dev/dxg
    volumes:
      - /dev/bus/usb:/dev/bus/usb
      - /usr/lib/wsl:/usr/lib/wsl

I run an Asustor AS6702T NAS with an Intel Celeron N5105 CPU. This problem started with v1.99.0.

The OS that Immich Server is running on

ADM 4.2.6RPI1 (Asustor NAS)

Version of Immich Server

v1.100.0

Version of Immich Mobile App

(not used)

Platform with the issue

  • [X] Server
  • [X] Web
  • [ ] Mobile

Your docker-compose.yml content

version: "3.8"

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: [ "start.sh", "immich" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.yml
      service: hwaccel
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino
    # image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: openvino # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:c5a607fb6e1bb15d32bbcf14db22787d19e428d59e31a5da67511b49bb0f1ccc
    restart: always

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

volumes:
  pgdata:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/volume1/Docker/immich/upload_location

EXTERNAL_LIBRARY=/volume2/...

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=immich_redis

Reproduction steps

1. Enable the openvino machine learning image / container
2. Try to search something using the web interface

Additional information

No response

fwmone avatar Mar 29 '24 09:03 fwmone

It seems that your hwaccel.ml.yml file is out of date; try downloading the newest one from the release page and check again.

aviv926 avatar Mar 29 '24 13:03 aviv926

Thanks! Yes, you were right, my config files were outdated; sorry for that. I updated all of them, but the problem persists.

docker-compose.yml:

version: "3.8"

# WARNING: Make sure to use the docker-compose.yml of the current release:
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
# The compose file on main may not be compatible with the latest release.

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: ['start.sh', 'immich']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/hardware-transcoding
      file: hwaccel.transcoding.yml
      service: quicksync # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    command: ['start.sh', 'microservices']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - ${EXTERNAL_LIBRARY}://mnt/media/external_library:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-openvino
    # image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: openvino # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the -wsl version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:51d6c56749a4243096327e3fb964a48ed92254357108449cb6e23999c37773c5
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always

volumes:
  pgdata:
  model-cache:

hwaccel.ml.yml:

version: "3.8"

# Configurations for hardware-accelerated machine learning
# If using Unraid or another platform that doesn't allow multiple Compose files,
# you can inline the config for a backend by copying its contents
# into the immich-machine-learning service in the docker-compose.yml file.
# See https://immich.app/docs/features/ml-hardware-acceleration for info on usage.

services:
  armnn:
    devices:
      - /dev/mali0:/dev/mali0
    volumes:
      - /lib/firmware/mali_csffw.bin:/lib/firmware/mali_csffw.bin:ro # Mali firmware for your chipset (not always required depending on the driver)
      - /usr/lib/libmali.so:/usr/lib/libmali.so:ro # Mali driver for your chipset (always required)

  cpu: {}

  cuda:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities:
                - gpu
                - compute
                - video

  openvino:
    device_cgroup_rules:
      - "c 189:* rmw"
    devices:
      - /dev/dri:/dev/dri
    volumes:
      - /dev/bus/usb:/dev/bus/usb

  openvino-wsl:
    devices:
      - /dev/dri:/dev/dri
      - /dev/dxg:/dev/dxg
    volumes:
      - /dev/bus/usb:/dev/bus/usb
      - /usr/lib/wsl:/usr/lib/wsl

Log:

[03/29/24 16:57:34] INFO Starting gunicorn 21.2.0
[03/29/24 16:57:34] INFO Listening at: http://[::]:3003 (9)
[03/29/24 16:57:34] INFO Using worker: app.config.CustomUvicornWorker
[03/29/24 16:57:34] INFO Booting worker with pid: 13
[03/29/24 16:57:41] INFO Started server process [13]
[03/29/24 16:57:41] INFO Waiting for application startup.
[03/29/24 16:57:41] INFO Created in-memory cache with unloading after 300s
of inactivity.
[03/29/24 16:57:41] INFO Initialized request thread pool with 4 threads.
[03/29/24 16:57:41] INFO Application startup complete.
[03/29/24 16:59:38] INFO Setting 'ViT-B-32__openai' execution providers to
['OpenVINOExecutionProvider',
'CPUExecutionProvider'], in descending order of
preference
[03/29/24 16:59:38] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-03-29 16:59:42.034430019 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[03/29/24 16:59:42] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:146 in _load        │
                         │                                                 │
                         │   143 │   │   super().__init__(clean_name(model │
                         │   144 │                                         │
                         │   145 │   def _load(self) -> None:              │
                         │ ❱ 146 │   │   super()._load()                   │
                         │   147 │   │   self._load_tokenizer()            │
                         │   148 │   │                                     │
                         │   149 │   │   size: list[int] | int = self.prep │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:36 in _load         │
                         │                                                 │
                         │    33 │   def _load(self) -> None:              │
                         │    34 │   │   if self.mode == "text" or self.mo │
                         │    35 │   │   │   log.debug(f"Loading clip text │
                         │ ❱  36 │   │   │   self.text_model = self._make_ │
                         │    37 │   │   │   log.debug(f"Loaded clip text  │
                         │    38 │   │                                     │
                         │    39 │   │   if self.mode == "vision" or self. │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :         
                         RUNTIME_EXCEPTION : Encountered unknown exception  
                         in Initialize()                                    

[03/29/24 17:00:25] INFO Loading clip model 'ViT-B-32__openai' to memory
2024-03-29 17:00:28.560632989 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[03/29/24 17:00:28] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:146 in _load        │
                         │                                                 │
                         │   143 │   │   super().__init__(clean_name(model │
                         │   144 │                                         │
                         │   145 │   def _load(self) -> None:              │
                         │ ❱ 146 │   │   super()._load()                   │
                         │   147 │   │   self._load_tokenizer()            │
                         │   148 │   │                                     │
                         │   149 │   │   size: list[int] | int = self.prep │
                         │                                                 │
                         │ /usr/src/app/models/clip.py:36 in _load         │
                         │                                                 │
                         │    33 │   def _load(self) -> None:              │
                         │    34 │   │   if self.mode == "text" or self.mo │
                         │    35 │   │   │   log.debug(f"Loading clip text │
                         │ ❱  36 │   │   │   self.text_model = self._make_ │
                         │    37 │   │   │   log.debug(f"Loaded clip text  │
                         │    38 │   │                                     │
                         │    39 │   │   if self.mode == "vision" or self. │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :         
                         RUNTIME_EXCEPTION : Encountered unknown exception  
                         in Initialize()

fwmone avatar Mar 29 '24 17:03 fwmone

1.99 updated to a newer version of OpenVINO. This actually fixed smart search for most users from what I've seen, so interesting that it broke it for you. Unfortunately, there isn't much I can do here. It's surprisingly difficult to make OpenVINO work for everyone.

mertalev avatar Apr 01 '24 02:04 mertalev

I am seeing this issue on my humble Intel J5005 (Gemini Lake). This is a fresh install as of yesterday evening:

  • Installed and configured following the docs
  • Enabled OpenVino
  • Added a single RO external library

I have over 10GB of RAM unallocated. gunicorn appears in intel_gpu_top, but shows no usage.

OpenVino inference is working nicely for Frigate NVR, but perhaps this model is too heavy for the iGPU? I'm happy to wipe/experiment with my installation if I can help in any way.

The log below is for a single face detection attempt (concurrency = 1)

immich_machine_learning logs

[04/23/24 10:13:26] INFO Starting gunicorn 22.0.0
[04/23/24 10:13:26] INFO Listening at: http://[::]:3003 (9)
[04/23/24 10:13:26] INFO Using worker: app.config.CustomUvicornWorker
[04/23/24 10:13:26] INFO Booting worker with pid: 13
[04/23/24 10:13:27] DEBUG Could not load ANN shared libraries, using ONNX:
libmali.so: cannot open shared object file: No such file or directory
[04/23/24 10:13:34] INFO Started server process [13]
[04/23/24 10:13:34] INFO Waiting for application startup.
[04/23/24 10:13:34] INFO Created in-memory cache with unloading after 300s
of inactivity.
[04/23/24 10:13:34] INFO Initialized request thread pool with 4 threads.
[04/23/24 10:13:34] DEBUG Checking for inactivity...
[04/23/24 10:13:34] INFO Application startup complete.
[04/23/24 10:13:44] DEBUG Checking for inactivity...
[04/23/24 10:13:52] DEBUG Available ORT providers: {'CPUExecutionProvider',
'OpenVINOExecutionProvider'}
[04/23/24 10:13:52] DEBUG Available OpenVINO devices: ['CPU', 'GPU']
[04/23/24 10:13:52] INFO Setting 'buffalo_l' execution providers to
['OpenVINOExecutionProvider',
'CPUExecutionProvider'], in descending order of
preference
[04/23/24 10:13:52] DEBUG Setting execution provider options to
[{'device_type': 'GPU_FP32', 'cache_dir':
'/cache/facial-recognition/buffalo_l/openvino'},
{'arena_extend_strategy': 'kSameAsRequested'}]
[04/23/24 10:13:52] DEBUG Setting execution_mode to ORT_SEQUENTIAL
[04/23/24 10:13:52] DEBUG Setting inter_op_num_threads to 0
[04/23/24 10:13:52] DEBUG Setting intra_op_num_threads to 0
[04/23/24 10:13:52] DEBUG Setting preferred runtime to onnx
[04/23/24 10:13:52] INFO Loading facial recognition model 'buffalo_l' to
memory
2024-04-23 10:13:53.433384806 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[04/23/24 10:13:53] ERROR Exception in ASGI application

                         ╭─────── Traceback (most recent call last) ───────╮
                         │ /usr/src/app/main.py:116 in predict             │
                         │                                                 │
                         │   113 │   except orjson.JSONDecodeError:        │
                         │   114 │   │   raise HTTPException(400, f"Invali │
                         │   115 │                                         │
                         │ ❱ 116 │   model = await load(await model_cache. │
                         │       ttl=settings.model_ttl, **kwargs))        │
                         │   117 │   model.configure(**kwargs)             │
                         │   118 │   outputs = await run(model.predict, in │
                         │   119 │   return ORJSONResponse(outputs)        │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │      image = UploadFile(filename='blob',    │ │
                         │ │              size=491485,                   │ │
                         │ │              headers=Headers({'content-dis… │ │
                         │ │              'form-data; name="image";      │ │
                         │ │              filename="blob"',              │ │
                         │ │              'content-type':                │ │
                         │ │              'application/octet-stream'}))  │ │
                         │ │     inputs = b'\xff\xd8\xff\xe2\x01\xf0ICC… │ │
                         │ │              \x00\x00mntrRGB XYZ            │ │
                         │ │              \x07\xe2\x00\x03\x00\x14\x00\… │ │
                         │ │     kwargs = {                              │ │
                         │ │              │   'minScore': 0.7,           │ │
                         │ │              │   'maxDistance': 0.5,        │ │
                         │ │              │   'minFaces': 3              │ │
                         │ │              }                              │ │
                         │ │ model_name = 'buffalo_l'                    │ │
                         │ │ model_type = <ModelType.FACIAL_RECOGNITION: │ │
                         │ │              'facial-recognition'>          │ │
                         │ │    options = '{"minScore":0.7,"maxDistance… │ │
                         │ │       text = None                           │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /usr/src/app/main.py:137 in load                │
                         │                                                 │
                         │   134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │ ❱ 137 │   │   await run(_load, model)           │
                         │   138 │   │   return model                      │
                         │   139 │   except (OSError, InvalidProtobuf, Bad │
                         │   140 │   │   log.warning(                      │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │ _load = <function load.<locals>._load at    │ │
                         │ │         0x7f86807c0af0>                     │ │
                         │ │ model = <app.models.facial_recognition.Fac… │ │
                         │ │         object at 0x7f86850d7250>           │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /usr/src/app/main.py:125 in run                 │
                         │                                                 │
                         │   122 async def run(func: Callable[..., Any], i │
                         │   123 │   if thread_pool is None:               │
                         │   124 │   │   return func(inputs)               │
                         │ ❱ 125 │   return await asyncio.get_running_loop │
                         │   126                                           │
                         │   127                                           │
                         │   128 async def load(model: InferenceModel) ->  │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │   func = <function load.<locals>._load at   │ │
                         │ │          0x7f86807c0af0>                    │ │
                         │ │ inputs = <app.models.facial_recognition.Fa… │ │
                         │ │          object at 0x7f86850d7250>          │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /usr/lib/python3.10/concurrent/futures/thread.p │
                         │ y:58 in run                                     │
                         │                                                 │
                         │ /usr/src/app/main.py:134 in _load               │
                         │                                                 │
                         │   131 │                                         │
                         │   132 │   def _load(model: InferenceModel) -> N │
                         │   133 │   │   with lock:                        │
                         │ ❱ 134 │   │   │   model.load()                  │
                         │   135 │                                         │
                         │   136 │   try:                                  │
                         │   137 │   │   await run(_load, model)           │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │ model = <app.models.facial_recognition.Fac… │ │
                         │ │         object at 0x7f86850d7250>           │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /usr/src/app/models/base.py:52 in load          │
                         │                                                 │
                         │    49 │   │   │   return                        │
                         │    50 │   │   self.download()                   │
                         │    51 │   │   log.info(f"Loading {self.model_ty │
                         │       to memory")                               │
                         │ ❱  52 │   │   self._load()                      │
                         │    53 │   │   self.loaded = True                │
                         │    54 │                                         │
                         │    55 │   def predict(self, inputs: Any, **mode │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │ self = <app.models.facial_recognition.Face… │ │
                         │ │        object at 0x7f86850d7250>            │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /usr/src/app/models/facial_recognition.py:30 in │
                         │ _load                                           │
                         │                                                 │
                         │   27 │   │   super().__init__(clean_name(model_ │
                         │   28 │                                          │
                         │   29 │   def _load(self) -> None:               │
                         │ ❱ 30 │   │   self.det_model = RetinaFace(sessio │
                         │   31 │   │   self.rec_model = ArcFaceONNX(      │
                         │   32 │   │   │   self.rec_file.with_suffix(".on │
                         │   33 │   │   │   session=self._make_session(sel │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │ self = <app.models.facial_recognition.Face… │ │
                         │ │        object at 0x7f86850d7250>            │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /usr/src/app/models/base.py:117 in              │
                         │ _make_session                                   │
                         │                                                 │
                         │   114 │   │   │   case ".armnn":                │
                         │   115 │   │   │   │   session = AnnSession(mode │
                         │   116 │   │   │   case ".onnx":                 │
                         │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                         │   118 │   │   │   │   │   model_path.as_posix() │
                         │   119 │   │   │   │   │   sess_options=self.ses │
                         │   120 │   │   │   │   │   providers=self.provid │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │ model_path = PosixPath('/cache/facial-reco… │ │
                         │ │       self = <app.models.facial_recognitio… │ │
                         │ │              object at 0x7f86850d7250>      │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:41 │
                         │ 9 in __init__                                   │
                         │                                                 │
                         │    416 │   │   disabled_optimizers = kwargs["di │
                         │        kwargs else None                         │
                         │    417 │   │                                    │
                         │    418 │   │   try:                             │
                         │ ❱  419 │   │   │   self._create_inference_sessi │
                         │        disabled_optimizers)                     │
                         │    420 │   │   except (ValueError, RuntimeError │
                         │    421 │   │   │   if self._enable_fallback:    │
                         │    422 │   │   │   │   try:                     │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │ disabled_optimizers = None                  │ │
                         │ │              kwargs = {}                    │ │
                         │ │       path_or_bytes = '/cache/facial-recog… │ │
                         │ │    provider_options = [                     │ │
                         │ │                       │   {                 │ │
                         │ │                       │   │                 │ │
                         │ │                       'device_type':        │ │
                         │ │                       'GPU_FP32',           │ │
                         │ │                       │   │   'cache_dir':  │ │
                         │ │                       '/cache/facial-recog… │ │
                         │ │                       │   },                │ │
                         │ │                       │   {                 │ │
                         │ │                       │   │                 │ │
                         │ │                       'arena_extend_strate… │ │
                         │ │                       'kSameAsRequested'    │ │
                         │ │                       │   }                 │ │
                         │ │                       ]                     │ │
                         │ │           providers = [                     │ │
                         │ │                       │                     │ │
                         │ │                       'OpenVINOExecutionPr… │ │
                         │ │                       │                     │ │
                         │ │                       'CPUExecutionProvide… │ │
                         │ │                       ]                     │ │
                         │ │                self = <onnxruntime.capi.on… │ │
                         │ │                       object at             │ │
                         │ │                       0x7f86850d7d30>       │ │
                         │ │        sess_options = <onnxruntime.capi.on… │ │
                         │ │                       object at             │ │
                         │ │                       0x7f86807009f0>       │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         │                                                 │
                         │ /opt/venv/lib/python3.10/site-packages/onnxrunt │
                         │ ime/capi/onnxruntime_inference_collection.py:48 │
                         │ 3 in _create_inference_session                  │
                         │                                                 │
                         │    480 │   │   │   disabled_optimizers = set(di │
                         │    481 │   │                                    │
                         │    482 │   │   # initialize the C++ InferenceSe │
                         │ ❱  483 │   │   sess.initialize_session(provider │
                         │    484 │   │                                    │
                         │    485 │   │   self._sess = sess                │
                         │    486 │   │   self._sess_options = self._sess. │
                         │                                                 │
                         │ ╭────────────────── locals ───────────────────╮ │
                         │ │ available_providers = [                     │ │
                         │ │                       │                     │ │
                         │ │                       'OpenVINOExecutionPr… │ │
                         │ │                       │                     │ │
                         │ │                       'CPUExecutionProvide… │ │
                         │ │                       ]                     │ │
                         │ │ disabled_optimizers = set()                 │ │
                         │ │    provider_options = [                     │ │
                         │ │                       │   {                 │ │
                         │ │                       │   │                 │ │
                         │ │                       'device_type':        │ │
                         │ │                       'GPU_FP32',           │ │
                         │ │                       │   │   'cache_dir':  │ │
                         │ │                       '/cache/facial-recog… │ │
                         │ │                       │   },                │ │
                         │ │                       │   {                 │ │
                         │ │                       │   │                 │ │
                         │ │                       'arena_extend_strate… │ │
                         │ │                       'kSameAsRequested'    │ │
                         │ │                       │   }                 │ │
                         │ │                       ]                     │ │
                         │ │           providers = [                     │ │
                         │ │                       │                     │ │
                         │ │                       'OpenVINOExecutionPr… │ │
                         │ │                       │                     │ │
                         │ │                       'CPUExecutionProvide… │ │
                         │ │                       ]                     │ │
                         │ │                self = <onnxruntime.capi.on… │ │
                         │ │                       object at             │ │
                         │ │                       0x7f86850d7d30>       │ │
                         │ │                sess = <onnxruntime.capi.on… │ │
                         │ │                       object at             │ │
                         │ │                       0x7f8680789d30>       │ │
                         │ │     session_options = <onnxruntime.capi.on… │ │
                         │ │                       object at             │ │
                         │ │                       0x7f86807009f0>       │ │
                         │ ╰─────────────────────────────────────────────╯ │
                         ╰─────────────────────────────────────────────────╯
                         RuntimeException: [ONNXRuntimeError] : 6 :         
                         RUNTIME_EXCEPTION : Encountered unknown exception  
                         in Initialize()                                                   

reef-actor, Apr 23 '24

I am on an N5105 as well, with 16 GB RAM. I can confirm the exact same error when running 1.99.0 as above; after switching to immich-machine-learning:v1.98.2-openvino, everything works.
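For anyone who wants to pin the working version in the meantime, here is a minimal docker-compose.yml sketch (the service name, image tag, and volume name are assumptions based on the default Immich compose file; adjust to your setup):

```yaml
immich-machine-learning:
  container_name: immich_machine_learning
  # Pin to the last OpenVINO tag that works here instead of tracking "release".
  image: ghcr.io/immich-app/immich-machine-learning:v1.98.2-openvino
  environment:
    # Preload CLIP so a broken model load surfaces immediately at startup.
    - MACHINE_LEARNING_PRELOAD__CLIP=ViT-B-32__openai
  volumes:
    - model-cache:/cache
```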

With MACHINE_LEARNING_PRELOAD__CLIP=ViT-B-32__openai set, the error shows up immediately after startup:

immich-machine-learning logs
[04/24/24 20:57:02] INFO     Loading clip model 'ViT-B-32__openai' to memory    
2024-04-24 20:57:03.939642959 [E:onnxruntime:, inference_session.cc:1985 Initialize] Encountered unknown exception in Initialize()
[04/24/24 20:57:04] ERROR    Traceback (most recent call last):                 
                               File                                             
                             "/opt/venv/lib/python3.10/site-packages/starlette/r
                             outing.py", line 734, in lifespan                  
                                 async with self.lifespan_context(app) as       
                             maybe_state:                                       
                               File "/usr/lib/python3.10/contextlib.py", line   
                             199, in __aenter__                                 
                                 return await anext(self.gen)                   
                               File "/usr/src/app/main.py", line 55, in lifespan
                                 await preload_models(settings.preload)         
                               File "/usr/src/app/main.py", line 69, in         
                             preload_models                                     
                                 await load(await                               
                             model_cache.get(preload_models.clip,               
                             ModelType.CLIP))                                   
                               File "/usr/src/app/main.py", line 137, in load   
                                 await run(_load, model)                        
                               File "/usr/src/app/main.py", line 125, in run    
                                 return await                                   
                             asyncio.get_running_loop().run_in_executor(thread_p
                             ool, func, inputs)                                 
                               File                                             
                             "/usr/lib/python3.10/concurrent/futures/thread.py",
                             line 58, in run                                    
                                 result = self.fn(*self.args, **self.kwargs)    
                               File "/usr/src/app/main.py", line 134, in _load  
                                 model.load()                                   
                               File "/usr/src/app/models/base.py", line 52, in  
                             load                                               
                                 self._load()                                   
                               File "/usr/src/app/models/clip.py", line 146, in 
                             _load                                              
                                 super()._load()                                
                               File "/usr/src/app/models/clip.py", line 36, in  
                             _load                                              
                                 self.text_model =                              
                             self._make_session(self.textual_path)              
                               File "/usr/src/app/models/base.py", line 117, in 
                             _make_session                                      
                                 session = ort.InferenceSession(                
                               File                                             
                             "/opt/venv/lib/python3.10/site-packages/onnxruntime
                             /capi/onnxruntime_inference_collection.py", line   
                             419, in __init__                                   
                                 self._create_inference_session(providers,      
                             provider_options, disabled_optimizers)             
                               File                                             
                             "/opt/venv/lib/python3.10/site-packages/onnxruntime
                             /capi/onnxruntime_inference_collection.py", line   
                             483, in _create_inference_session                  
                                 sess.initialize_session(providers,             
                             provider_options, disabled_optimizers)             
                             onnxruntime.capi.onnxruntime_pybind11_state.Runtime
                             Exception: [ONNXRuntimeError] : 6 :                
                             RUNTIME_EXCEPTION : Encountered unknown exception  
                             in Initialize() 

omltcat, Apr 24 '24

I'm getting the same error on an i5-6200U (HD 520 iGPU) with Immich version 1.104.6, and I've seen it since v1.100 (the first time I tried using the GPU for hardware acceleration). Transcoding works fine.

It's running in a privileged LXC on the latest version of Proxmox, with 8 GB of RAM dedicated to it.

I'm not able to try the 1.98.2 version of the machine-learning container, though, because it triggers an error in the server container (presumably because the newer server sends a request payload the older ML container doesn't accept, hence the 422 below):

server container error with 1.98.2 machine learning container

```
[Nest] 7 - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Unable to run job handler (faceDetection/face-detection): Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"buffalo_l","options":{"minScore":0.7}},"recognition":{"modelName":"buffalo_l"}}}' failed with status 422: Unprocessable Entity
[Nest] 7 - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Error: Machine learning request '{"facial-recognition":{"detection":{"modelName":"buffalo_l","options":{"minScore":0.7}},"recognition":{"modelName":"buffalo_l"}}}' failed with status 422: Unprocessable Entity
    at MachineLearningRepository.predict (/usr/src/app/dist/repositories/machine-learning.repository.js:22:19)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async MachineLearningRepository.detectFaces (/usr/src/app/dist/repositories/machine-learning.repository.js:33:26)
    at async PersonService.handleDetectFaces (/usr/src/app/dist/services/person.service.js:274:52)
    at async /usr/src/app/dist/services/job.service.js:148:36
    at async Worker.processJob (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:394:28)
    at async Worker.retryIfFailed (/usr/src/app/node_modules/bullmq/dist/cjs/classes/worker.js:581:24)
[Nest] 7 - 06/14/2024, 11:32:55 AM ERROR [Microservices:JobService] Object:
{
  "id": "2c44089e-a5ab-4b47-894e-21ad138371b0"
}
[Nest] 17 - 06/14/2024, 11:33:53 AM LOG [Api:EventRepository] Websocket Disconnect: yaBQqD1_Clm1qZRGAAAJ
```

stefano99, Jun 14 '24

This should be fixed as of the current release. Be sure to delete the model cache volume so it downloads the updated models.
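For anyone unsure how to clear it, a rough sketch (the volume name is an assumption; check `docker volume ls` for yours):

```sh
docker compose down
# Remove the cached models so the updated ones are downloaded on next start.
docker volume rm immich_model-cache   # volume name is an assumption; verify with `docker volume ls`
docker compose up -d
```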

mertalev, Jul 27 '24

Looks good to me - thanks a bunch!

fwmone, Jul 27 '24