Getting error in ASGI: failed to allocate memory
The bug
I was just looking at my logs because of an issue I am having with facial recognition. These errors are unrelated, as they happened during the night, but I wanted to draw some attention to them in case they point to a real problem.
I see the memory error; however, this machine is currently using only 3GB of the 8GB allocated to it.
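For reference, a quick way to distinguish host memory use from per-container memory use, using standard Linux and Docker tooling (container names as in my compose file below):

# Host/LXC view of RAM
free -h

# One-shot per-container usage for the Immich services
docker stats --no-stream immich_server immich_machine_learning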
The OS that Immich Server is running on
Proxmox
Version of Immich Server
v1.115.0
Version of Immich Mobile App
NA
Platform with the issue
- [X] Server
- [ ] Web
- [ ] Mobile
Your docker-compose.yml content
name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.transcoding.yml
      service: quicksync # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - /storage/dropbox:/dropbox
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: unless-stopped

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: unless-stopped

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:2d1463258f2764328496376f5d965f20c6a67f66ea2b06dc42af351f75248792
    restart: unless-stopped

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: unless-stopped

volumes:
  model-cache:
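For context, the cuda service that immich-machine-learning extends is what actually exposes the GPU to the container. Paraphrasing the upstream hwaccel.ml.yml (a sketch only; check your local copy, which may differ by version):

# Sketch of the cuda service in hwaccel.ml.yml (paraphrased; verify against your copy)
cuda:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities:
              - gpu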
Your .env content
# The location where your uploaded files are stored
UPLOAD_LOCATION=/storage/dropbox/Pictures
# The location where your database files are stored
DB_DATA_LOCATION=/storage/appdata/immich
# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
TZ=America/Los_Angeles
# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release
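If the allocation failures are VRAM pressure from concurrent jobs, the Immich ML docs describe worker and model-TTL knobs; something like the following could be added here (variable names as documented at the time of writing — verify against your version before relying on them):

# Hypothetical additions to reduce ML memory pressure
MACHINE_LEARNING_WORKERS=1        # single worker, so only one set of models is resident
MACHINE_LEARNING_MODEL_TTL=300    # unload idle models after 5 minutes to free (V)RAM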
Reproduction steps
None, just an error call stack.
Relevant log output
[09/25/24 13:05:35] ERROR Exception in ASGI application
╭─────── Traceback (most recent call last) ───────╮
│ /usr/src/app/main.py:152 in predict │
│ │
│ 149 │ │ inputs = text │
│ 150 │ else: │
│ 151 │ │ raise HTTPException(400, "Either │
│ ❱ 152 │ response = await run_inference(inputs │
│ 153 │ return ORJSONResponse(response) │
│ 154 │
│ 155 │
│ │
│ /usr/src/app/main.py:177 in run_inference │
│ │
│ 174 │ without_deps, with_deps = entries │
│ 175 │ await asyncio.gather(*[_run_inference │
│ 176 │ if with_deps: │
│ ❱ 177 │ │ await asyncio.gather(*[_run_infer │
│ 178 │ if isinstance(payload, Image): │
│ 179 │ │ response["imageHeight"], response │
│ 180 │
│ │
│ /usr/src/app/main.py:170 in _run_inference │
│ │
│ 167 │ │ │ │ message = f"Task {entry[' │
│ output of {dep}" │
│ 168 │ │ │ │ raise HTTPException(400, │
│ 169 │ │ model = await load(model) │
│ ❱ 170 │ │ output = await run(model.predict, │
│ 171 │ │ outputs[model.identity] = output │
│ 172 │ │ response[entry["task"]] = output │
│ 173 │
│ │
│ /usr/src/app/main.py:188 in run │
│ │
│ 185 │ if thread_pool is None: │
│ 186 │ │ return func(*args, **kwargs) │
│ 187 │ partial_func = partial(func, *args, * │
│ ❱ 188 │ return await asyncio.get_running_loop │
│ 189 │
│ 190 │
│ 191 async def load(model: InferenceModel) -> │
│ │
│ /usr/local/lib/python3.11/concurrent/futures/th │
│ read.py:58 in run │
│ │
│ /usr/src/app/models/base.py:60 in predict │
│ │
│ 57 │ │ self.load() │
│ 58 │ │ if model_kwargs: │
│ 59 │ │ │ self.configure(**model_kwargs │
│ ❱ 60 │ │ return self._predict(*inputs, **m │
│ 61 │ │
│ 62 │ @abstractmethod │
│ 63 │ def _predict(self, *inputs: Any, **mo │
│ │
│ /usr/src/app/models/facial_recognition/recognit │
│ ion.py:45 in _predict │
│ │
│ 42 │ │ │ return [] │
│ 43 │ │ inputs = decode_cv2(inputs) │
│ 44 │ │ cropped_faces = self._crop(inputs, │
│ ❱ 45 │ │ embeddings = self._predict_batch(c │
│ self._predict_single(cropped_faces) │
│ 46 │ │ return self.postprocess(faces, emb │
│ 47 │ │
│ 48 │ def _predict_batch(self, cropped_faces │
│ NDArray[np.float32]: │
│ │
│ /usr/src/app/models/facial_recognition/recognit │
│ ion.py:49 in _predict_batch │
│ │
│ 46 │ │ return self.postprocess(faces, emb │
│ 47 │ │
│ 48 │ def _predict_batch(self, cropped_faces │
│ NDArray[np.float32]: │
│ ❱ 49 │ │ embeddings: NDArray[np.float32] = │
│ 50 │ │ return embeddings │
│ 51 │ │
│ 52 │ def _predict_single(self, cropped_face │
│ NDArray[np.float32]: │
│ │
│ /opt/venv/lib/python3.11/site-packages/insightf │
│ ace/model_zoo/arcface_onnx.py:84 in get_feat │
│ │
│ 81 │ │ │
│ 82 │ │ blob = cv2.dnn.blobFromImages(imgs │
│ 83 │ │ │ │ │ │ │ │ │ (sel │
│ self.input_mean), swapRB=True) │
│ ❱ 84 │ │ net_out = self.session.run(self.ou │
│ 85 │ │ return net_out │
│ 86 │ │
│ 87 │ def forward(self, batch_data): │
│ │
│ /usr/src/app/sessions/ort.py:49 in run │
│ │
│ 46 │ │ input_feed: dict[str, NDArray[np. │
│ 47 │ │ run_options: Any = None, │
│ 48 │ ) -> list[NDArray[np.float32]]: │
│ ❱ 49 │ │ outputs: list[NDArray[np.float32] │
│ run_options) │
│ 50 │ │ return outputs │
│ 51 │ │
│ 52 │ @property │
│ │
│ /opt/venv/lib/python3.11/site-packages/onnxrunt │
│ ime/capi/onnxruntime_inference_collection.py:22 │
│ 0 in run │
│ │
│ 217 │ │ if not output_names: │
│ 218 │ │ │ output_names = [output.name │
│ 219 │ │ try: │
│ ❱ 220 │ │ │ return self._sess.run(output │
│ 221 │ │ except C.EPFail as err: │
│ 222 │ │ │ if self._enable_fallback: │
│ 223 │ │ │ │ print(f"EP Error: {err!s │
╰─────────────────────────────────────────────────╯
RuntimeException: [ONNXRuntimeError] : 6 :
RUNTIME_EXCEPTION : Non-zero status code returned
while running Conv node. Name:'Conv_85' Status
Message:
/onnxruntime_src/onnxruntime/core/framework/bfc_are
na.cc:376 void*
onnxruntime::BFCArena::AllocateRawInternal(size_t,
bool, onnxruntime::Stream*, bool,
onnxruntime::WaitNotificationFn) Failed to allocate
memory for requested buffer of size 308674560
2024-09-25 13:05:38.937186818 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_60' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 168820736
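Reading the failure: the allocation that fails is onnxruntime's BFC arena requesting a ~294 MiB buffer (308674560 bytes), and since the ML container runs the -cuda image, that arena is most likely GPU memory rather than system RAM, which would explain why the 8GB host allocation looks mostly idle. A hedged way to confirm the CUDA provider is actually in play (the interpreter path is an assumption based on the /opt/venv paths in the traceback):

# Check which execution providers onnxruntime sees inside the ML container
docker exec -it immich_machine_learning \
  /opt/venv/bin/python -c "import onnxruntime as ort; print(ort.get_available_providers())"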
Additional information
No response
Is this running in an LXC? Docker in LXC is not supported, so we might need to replicate this in a VM to investigate further.
Yes, this is Docker in LXC. I get that it's 'not supported', but I run 27 containers in Docker in LXC with no issues, including Frigate, which also uses ML and deals with video and images.
There’s probably a way to make it work, but LXC introduces a whole other level of bugs and config issues that we just can’t reasonably handle. I’ll leave the issue open for now in case someone else sees a way to fix it or disagrees.
Is there a supported Immich LXC that I can use instead?
No, we don’t officially support LXC
Any luck with this @rayzorben? I'm having the same issue, and I have no idea how I can run this on an LXC without running Docker inside and passing the GPU through. Except a VM, but that's a bit overkill... Did you go with a VM?
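For what it's worth, the non-VM approach usually sketched for Proxmox is binding the NVIDIA device nodes into the LXC config, roughly like this (a hypothetical sketch, not a recipe — the cgroup major numbers vary per host, so check `ls -l /dev/nvidia*` before copying anything):

# Hypothetical /etc/pve/lxc/<vmid>.conf additions for NVIDIA passthrough;
# major numbers (195, 509 here) must match your host's /dev/nvidia* devices
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file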
Just FYI, this error is saying it couldn't allocate the VRAM it requested. If your GPU doesn't have enough free VRAM, that can also be the cause of the error.
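A quick way to verify that while the facial recognition job runs, assuming nvidia-smi is available on the host (or inside the container):

# Poll GPU memory once per second while the job runs
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1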
Yes, this happens even when I stop all other processes, so I believe this is some incompatibility. I'll maybe try a VM.
Hello, any news? I have the same problem.