Getting error in ASGI: failed to allocate memory
The bug
I was just looking at my logs because of an issue I am having with facial recognition. These errors are unrelated, as they happened during the night, but I wanted to draw some attention to them in case they point to a real problem.
I see the memory error; however, this machine is currently using only 3GB of the 8GB allocated to it.
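For reference, a quick way to distinguish host memory use from per-container memory use, using standard Linux and Docker tooling (container names as in my compose file below):

# Host/LXC view of RAM
free -h

# One-shot per-container usage for the Immich services
docker stats --no-stream immich_server immich_machine_learning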
The OS that Immich Server is running on
Proxmox
Version of Immich Server
v1.115.0
Version of Immich Mobile App
NA
Platform with the issue
- [X] Server
- [ ] Web
- [ ] Mobile
Your docker-compose.yml content
name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.transcoding.yml
      service: quicksync # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - /storage/dropbox:/dropbox
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: unless-stopped

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: unless-stopped

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:2d1463258f2764328496376f5d965f20c6a67f66ea2b06dc42af351f75248792
    restart: unless-stopped

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: unless-stopped

volumes:
  model-cache:
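For context, the cuda service that immich-machine-learning extends is what actually exposes the GPU to the container. Paraphrasing the upstream hwaccel.ml.yml (a sketch only; check your local copy, which may differ by version):

# Sketch of the cuda service in hwaccel.ml.yml (paraphrased; verify against your copy)
cuda:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities:
              - gpu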
Your .env content
# The location where your uploaded files are stored
UPLOAD_LOCATION=/storage/dropbox/Pictures
# The location where your database files are stored
DB_DATA_LOCATION=/storage/appdata/immich
# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
TZ=America/Los_Angeles
# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release
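If the allocation failures are VRAM pressure from concurrent jobs, the Immich ML docs describe worker and model-TTL knobs; something like the following could be added here (variable names as documented at the time of writing — verify against your version before relying on them):

# Hypothetical additions to reduce ML memory pressure
MACHINE_LEARNING_WORKERS=1        # single worker, so only one set of models is resident
MACHINE_LEARNING_MODEL_TTL=300    # unload idle models after 5 minutes to free (V)RAM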
Reproduction steps
None, just an error call stack.
Relevant log output
[09/25/24 13:05:35] ERROR Exception in ASGI application
╭─────── Traceback (most recent call last) ───────╮
│ /usr/src/app/main.py:152 in predict │
│ │
│ 149 │ │ inputs = text │
│ 150 │ else: │
│ 151 │ │ raise HTTPException(400, "Either │
│ ❱ 152 │ response = await run_inference(inputs │
│ 153 │ return ORJSONResponse(response) │
│ 154 │
│ 155 │
│ │
│ /usr/src/app/main.py:177 in run_inference │
│ │
│ 174 │ without_deps, with_deps = entries │
│ 175 │ await asyncio.gather(*[_run_inference │
│ 176 │ if with_deps: │
│ ❱ 177 │ │ await asyncio.gather(*[_run_infer │
│ 178 │ if isinstance(payload, Image): │
│ 179 │ │ response["imageHeight"], response │
│ 180 │
│ │
│ /usr/src/app/main.py:170 in _run_inference │
│ │
│ 167 │ │ │ │ message = f"Task {entry[' │
│ output of {dep}" │
│ 168 │ │ │ │ raise HTTPException(400, │
│ 169 │ │ model = await load(model) │
│ ❱ 170 │ │ output = await run(model.predict, │
│ 171 │ │ outputs[model.identity] = output │
│ 172 │ │ response[entry["task"]] = output │
│ 173 │
│ │
│ /usr/src/app/main.py:188 in run │
│ │
│ 185 │ if thread_pool is None: │
│ 186 │ │ return func(*args, **kwargs) │
│ 187 │ partial_func = partial(func, *args, * │
│ ❱ 188 │ return await asyncio.get_running_loop │
│ 189 │
│ 190 │
│ 191 async def load(model: InferenceModel) -> │
│ │
│ /usr/local/lib/python3.11/concurrent/futures/th │
│ read.py:58 in run │
│ │
│ /usr/src/app/models/base.py:60 in predict │
│ │
│ 57 │ │ self.load() │
│ 58 │ │ if model_kwargs: │
│ 59 │ │ │ self.configure(**model_kwargs │
│ ❱ 60 │ │ return self._predict(*inputs, **m │
│ 61 │ │
│ 62 │ @abstractmethod │
│ 63 │ def _predict(self, *inputs: Any, **mo │
│ │
│ /usr/src/app/models/facial_recognition/recognit │
│ ion.py:45 in _predict │
│ │
│ 42 │ │ │ return [] │
│ 43 │ │ inputs = decode_cv2(inputs) │
│ 44 │ │ cropped_faces = self._crop(inputs, │
│ ❱ 45 │ │ embeddings = self._predict_batch(c │
│ self._predict_single(cropped_faces) │
│ 46 │ │ return self.postprocess(faces, emb │
│ 47 │ │
│ 48 │ def _predict_batch(self, cropped_faces │
│ NDArray[np.float32]: │
│ │
│ /usr/src/app/models/facial_recognition/recognit │
│ ion.py:49 in _predict_batch │
│ │
│ 46 │ │ return self.postprocess(faces, emb │
│ 47 │ │
│ 48 │ def _predict_batch(self, cropped_faces │
│ NDArray[np.float32]: │
│ ❱ 49 │ │ embeddings: NDArray[np.float32] = │
│ 50 │ │ return embeddings │
│ 51 │ │
│ 52 │ def _predict_single(self, cropped_face │
│ NDArray[np.float32]: │
│ │
│ /opt/venv/lib/python3.11/site-packages/insightf │
│ ace/model_zoo/arcface_onnx.py:84 in get_feat │
│ │
│ 81 │ │ │
│ 82 │ │ blob = cv2.dnn.blobFromImages(imgs │
│ 83 │ │ │ │ │ │ │ │ │ (sel │
│ self.input_mean), swapRB=True) │
│ ❱ 84 │ │ net_out = self.session.run(self.ou │
│ 85 │ │ return net_out │
│ 86 │ │
│ 87 │ def forward(self, batch_data): │
│ │
│ /usr/src/app/sessions/ort.py:49 in run │
│ │
│ 46 │ │ input_feed: dict[str, NDArray[np. │
│ 47 │ │ run_options: Any = None, │
│ 48 │ ) -> list[NDArray[np.float32]]: │
│ ❱ 49 │ │ outputs: list[NDArray[np.float32] │
│ run_options) │
│ 50 │ │ return outputs │
│ 51 │ │
│ 52 │ @property │
│ │
│ /opt/venv/lib/python3.11/site-packages/onnxrunt │
│ ime/capi/onnxruntime_inference_collection.py:22 │
│ 0 in run │
│ │
│ 217 │ │ if not output_names: │
│ 218 │ │ │ output_names = [output.name │
│ 219 │ │ try: │
│ ❱ 220 │ │ │ return self._sess.run(output │
│ 221 │ │ except C.EPFail as err: │
│ 222 │ │ │ if self._enable_fallback: │
│ 223 │ │ │ │ print(f"EP Error: {err!s │
╰─────────────────────────────────────────────────╯
RuntimeException: [ONNXRuntimeError] : 6 :
RUNTIME_EXCEPTION : Non-zero status code returned
while running Conv node. Name:'Conv_85' Status
Message:
/onnxruntime_src/onnxruntime/core/framework/bfc_are
na.cc:376 void*
onnxruntime::BFCArena::AllocateRawInternal(size_t,
bool, onnxruntime::Stream*, bool,
onnxruntime::WaitNotificationFn) Failed to allocate
memory for requested buffer of size 308674560
2024-09-25 13:05:38.937186818 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_60' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 168820736
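Reading the failure: the allocation that fails is onnxruntime's BFC arena requesting a ~294 MiB buffer (308674560 bytes), and since the ML container runs the -cuda image, that arena is most likely GPU memory rather than system RAM, which would explain why the 8GB host allocation looks mostly idle. A hedged way to confirm the CUDA provider is actually in play (the interpreter path is an assumption based on the /opt/venv paths in the traceback):

# Check which execution providers onnxruntime sees inside the ML container
docker exec -it immich_machine_learning \
  /opt/venv/bin/python -c "import onnxruntime as ort; print(ort.get_available_providers())"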
Additional information
No response
Is this running in an LXC? Docker in LXC is not supported, so we might need to replicate this in a VM to investigate further.
Yes, this is Docker in LXC. I get that it's 'not supported', but I run 27 containers in Docker in LXC with no issues, including Frigate, which also uses ML and deals with video and images.
There’s probably a way to make it work, but LXC introduces a whole other level of bugs and config issues that we just can’t reasonably handle. I’ll leave the issue open for now in case someone else sees a way to fix it or disagrees.
Is there a supported Immich LXC that I can use instead?
No, we don’t officially support LXC
Any luck with this @rayzorben? I'm having the same issue, and I have no idea how I can run this on an LXC without running Docker inside and passing the GPU through. Except a VM, but that's a bit overkill... Did you go with a VM?
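For what it's worth, the non-VM approach usually sketched for Proxmox is binding the NVIDIA device nodes into the LXC config, roughly like this (a hypothetical sketch, not a recipe — the cgroup major numbers vary per host, so check `ls -l /dev/nvidia*` before copying anything):

# Hypothetical /etc/pve/lxc/<vmid>.conf additions for NVIDIA passthrough;
# major numbers (195, 509 here) must match your host's /dev/nvidia* devices
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file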
Just FYI, this error is saying it couldn't allocate the VRAM it requested. If your GPU doesn't have enough free VRAM, that can also be the cause of the error.
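A quick way to verify that while the facial recognition job runs, assuming nvidia-smi is available on the host (or inside the container):

# Poll GPU memory once per second while the job runs
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1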
Yes, this happens even when I stop all other processes, so I believe this is some incompatibility. I'll maybe try a VM.
Hello, any news? I have the same problem.