immich
[BUG] Repair page not accessible
The bug
When I try to access the /admin/repair endpoint, I get this error:
In the logs it looks like this:
2023/10/17 09:18:54 [error] 45#45: *708 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 192.168.3.102, server: , request: "GET /admin/repair/__data.json?x-sveltekit-invalidated=01 HTTP/1.1", upstream: "http://172.18.0.3:3000/admin/repair/__data.json?x-sveltekit-invalidated=01", host: "<removed>", referrer: "https://<removed>/admin/jobs-status"
2023/10/17 09:18:58 [error] 46#46: *723 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 192.168.3.102, server: , request: "GET /admin/repair HTTP/1.1", upstream: "http://172.18.0.3:3000/admin/repair", host: "photos.zkr.io", referrer: "https://<removed>/admin/jobs-status"
The OS that Immich Server is running on
Proxmox Alpine Based LXC with docker
Version of Immich Server
v1.82.0
Version of Immich Mobile App
v1.82.0
Platform with the issue
- [ ] Server
- [X] Web
- [ ] Mobile
Your docker-compose.yml content
version: '3.8'
services:
  immich-server:
    image: ghcr.io/immich-app/immich-server:release
    container_name: immich_server
    restart: always
    command: [ "start.sh", "immich" ]
    volumes:
      - /shared/nas/immich:/usr/src/app/upload
      - /shared/nas/photos:/mnt/media/nas:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
      - typesense
  immich-microservices:
    image: ghcr.io/immich-app/immich-server:release
    container_name: immich_microservices
    restart: always
    command: [ "start.sh", "microservices" ]
    volumes:
      - /shared/nas/immich:/usr/src/app/upload
      - /shared/nas/photos:/mnt/media/nas:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
      - typesense
    environment:
      # - LOG_LEVEL=verbose
      - TZ=Europe/Berlin
  immich-machine-learning:
    image: ghcr.io/immich-app/immich-machine-learning:release
    container_name: immich_machine_learning
    restart: always
    volumes:
      - model-cache:/cache
    env_file:
      - .env
  immich-web:
    container_name: immich_web
    image: ghcr.io/immich-app/immich-web:release
    restart: always
    env_file:
      - .env
  typesense:
    image: typesense/typesense:0.24.1
    container_name: immich_typesense
    restart: always
    environment:
      - TYPESENSE_API_KEY=${TYPESENSE_API_KEY}
      - TYPESENSE_DATA_DIR=/data
      # remove this to get debug messages
      - GLOG_minloglevel=1
    volumes:
      - tsdata:/data
  redis:
    container_name: immich_redis
    image: redis:6.2-alpine
    restart: always
  database:
    image: postgres:14-alpine
    container_name: immich_postgres
    restart: always
    volumes:
      - ~/files/postgres:/var/lib/postgresql/data
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
  immich-proxy:
    image: ghcr.io/immich-app/immich-proxy:release
    container_name: immich_proxy
    restart: always
    environment:
      # Make sure these values get passed through from the env file
      - IMMICH_SERVER_URL
      - IMMICH_WEB_URL
    ports:
      - 80:8080
    depends_on:
      - immich-server
      - immich-web
volumes:
  model-cache:
  tsdata:
Your .env content
# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables
# The location where your uploaded files are stored
UPLOAD_LOCATION=./library
# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release
# Connection secrets for postgres and typesense. You should change these to random passwords
TYPESENSE_API_KEY=<removed>
DB_PASSWORD=<removed>
# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich
REDIS_HOSTNAME=immich_redis
Reproduction steps
1. Update to v1.82.0
2. Go to repair page
Additional information
No response
I also have the exact same issue
Same here. I smell some hotfix in the air 🚀
me too
Can confirm getting the same issue
Just for the devs to consider: @alextran1502 mentioned that there will be a patch and some UI improvements. What I observed is that when I navigate to the repair tab, Immich starts comparing the filesystem and the database, which makes the system unresponsive for some time. I get that error, but the system does not stop the comparison even when I navigate to another tab. Maybe this could be optimized so that the compare operations stop when somebody navigates away from the repair tab.
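To make the suggestion above concrete, here is a minimal sketch of cooperative cancellation for a filesystem-versus-database comparison. It is not Immich's code; Python is used only for illustration, and `find_untracked`, `ScanCancelled`, and the `cancel` flag are hypothetical names. It assumes the request handler has some way to notice that the client has gone away and can set the flag.

```python
import threading
from typing import Iterable, List, Set


class ScanCancelled(Exception):
    """Raised when the scan is asked to stop before it finishes."""


def find_untracked(
    disk_paths: Iterable[str],
    db_paths: Set[str],
    cancel: threading.Event,
    check_every: int = 1000,
) -> List[str]:
    """Return paths that exist on disk but are missing from the database."""
    untracked: List[str] = []
    for i, path in enumerate(disk_paths):
        # Poll the flag periodically so an abandoned request stops early
        # instead of scanning the whole library for nobody.
        if i % check_every == 0 and cancel.is_set():
            raise ScanCancelled(f"stopped after {i} files")
        if path not in db_paths:
            untracked.append(path)
    return untracked
```

In this sketch the caller would invoke `cancel.set()` when the user leaves the page; the scan then gives up at the next check rather than running to completion.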
Could the repair report be similar to other jobs where it needs to be run before looking at the results?
@andrewgdunn Yeah we are planning that for the fix/enhancement
@alextran1502 Even after 1.82.1 it's still not fixed. It's even worse, as the whole of Immich just crashes and a stack restart is needed.
@Pheggas we haven't implemented the fix for this yet. When it is, it will be mentioned in the release note
> @alextran1502 Even after 1.82.1 it's still not fixed. It's even worse, as the whole of Immich just crashes and a stack restart is needed.

That particular issue was not part of 1.82.1 according to the release notes.
Ah, sorry for that.
I just updated to v1.82.1 and I have the same issue when accessing the repair page.
> I just updated to v1.82.1 and I have the same issue when accessing the repair page.

Again (as already stated two comments above yours), a fix for this issue is not included in the 1.82.1 update.
I'm on version 1.83.0 and still have the issue as well.
Please read my comment 4 posts above yours (and some others also mention this). The fix is still not mentioned in the release notes, so it's not part of the 1.83.0 release.
Still present in v1.85.0, but that's expected since it isn't fixed yet ;)
The issue will be closed once it is fixed
With the new container structure in v1.88.0, the repair page loads successfully; you just have to wait several minutes for it :)
> With the new container structure in v1.88.0, the repair page loads successfully; you just have to wait several minutes for it :)

I still have this issue on 1.88.2 as well. The loading bar at the top is there and there is no timeout, but even after 30 minutes there is no repair page. The stats of the docker containers show activity for many minutes, but then they go idle.
I was able to open the "Repair" page, but only using the local LAN IP. I just waited a few minutes (roughly 5 to 10). I saw some errors in the server container, but the page loaded.
[Nest] 7 - 11/22/2023, 10:28:36 AM ERROR [ExceptionsHandler] Connection terminated due to connection timeout
Error: Connection terminated due to connection timeout
at Connection.<anonymous> (/usr/src/app/node_modules/pg/lib/client.js:132:73)
at Object.onceWrapper (node:events:628:28)
at Connection.emit (node:events:514:28)
at Socket.<anonymous> (/usr/src/app/node_modules/pg/lib/connection.js:63:12)
at Socket.emit (node:events:514:28)
at TCP.<anonymous> (node:net:337:12)
[Nest] 7 - 11/22/2023, 10:28:36 AM ERROR [ExceptionsHandler] Connection terminated due to connection timeout
Error: Connection terminated due to connection timeout
at Connection.<anonymous> (/usr/src/app/node_modules/pg/lib/client.js:132:73)
at Object.onceWrapper (node:events:628:28)
at Connection.emit (node:events:514:28)
at Socket.<anonymous> (/usr/src/app/node_modules/pg/lib/connection.js:63:12)
at Socket.emit (node:events:514:28)
at TCP.<anonymous> (node:net:337:12)
Accessing the Repair page through the Nginx Proxy Manager redirection still returns the same error as before.
Nginx and other proxies will still enforce a timeout, but directly hitting the server won't since it is not configured with any.
> Nginx and other proxies will still enforce a timeout, but directly hitting the server won't since it is not configured with any.

I saw that too. Of course there should not be a timeout anymore, but I still can't view the page even after multiple tries and waiting 30 minutes. I saw that the different containers are busy CPU-wise for many minutes, but then they stop and go idle. There are 250k pictures on the server, all on enterprise SSDs, so performance should not be a big problem. I hope this can be fixed with a job that pre-generates the report in the background and then displays it on the repair page.
I think we will eventually move to a (background) report, yes. But for now at least some people can view it 😅
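As a rough illustration of the background-report idea discussed above, the sketch below builds the report on a worker thread and lets the page ask for its status without waiting. This is only a sketch under assumptions, not the planned Immich implementation: `ReportCache`, `refresh`, and `status` are hypothetical names, and the `build` callable stands in for whatever slow comparison produces the report.

```python
import threading
from typing import Callable, Dict, List, Optional


class ReportCache:
    """Holds the most recent report and rebuilds it off the request path."""

    def __init__(self, build: Callable[[], List[str]]) -> None:
        self._build = build
        self._lock = threading.Lock()
        self._report: Optional[List[str]] = None
        self._worker: Optional[threading.Thread] = None

    def refresh(self) -> threading.Thread:
        """Start a rebuild unless one is already running; return its thread."""
        with self._lock:
            if self._worker is None or not self._worker.is_alive():
                self._worker = threading.Thread(target=self._run, daemon=True)
                self._worker.start()
            return self._worker

    def _run(self) -> None:
        report = self._build()  # slow part: the lock is not held while scanning
        with self._lock:
            self._report = report

    def status(self) -> Dict[str, object]:
        """Return immediately, whether or not a report exists yet."""
        with self._lock:
            if self._report is None:
                return {"ready": False, "items": []}
            return {"ready": True, "items": list(self._report)}
```

With this shape, a page request only ever calls `status()`, so it cannot hit a proxy timeout however long the scan takes; the scan itself is triggered separately via `refresh()`, much like the other jobs mentioned earlier in the thread.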
> Nginx and other proxies will still enforce a timeout, but directly hitting the server won't since it is not configured with any.

You can add the following to your nginx config to increase the timeouts:
```
keepalive_timeout 1d;
send_timeout 1d;
client_body_timeout 1d;
client_header_timeout 1d;
proxy_connect_timeout 1d;
proxy_read_timeout 1d;
proxy_send_timeout 1d;
```
For me, the repair page crashes the Chrome tab after 15 minutes of waiting. I tried starting Chrome with --max_old_space_size=16000 --js-flags="--max-old-space-size=16000", but that didn't help.
I am running v1.89 as an Unraid Docker container and I have the same error. I cannot get back into Immich at all. Do I need to restart the container?
I also see a timeout error when accessing the repair page:
immich_server | [Nest] 7 - 12/14/2023, 6:39:32 PM ERROR [ExceptionsHandler] Connection terminated due to connection timeout
immich_server | Error: Connection terminated due to connection timeout
immich_server | at Connection.<anonymous> (/usr/src/app/node_modules/pg/lib/client.js:132:73)
immich_server | at Object.onceWrapper (node:events:628:28)
immich_server | at Connection.emit (node:events:514:28)
immich_server | at Socket.<anonymous> (/usr/src/app/node_modules/pg/lib/connection.js:63:12)
immich_server | at Socket.emit (node:events:514:28)
immich_server | at TCP.<anonymous> (node:net:337:12)
Reloading the page does not help. I can go back to the Immich homepage and open the repair page from the admin panel again, but then I have the same issue.
Sometimes it works. Then I see that there are a lot of untracked files, mostly thumbnails and encoded videos, which might be untracked because I moved some files around in my external library. I would like to clean this up anyhow, but the "repair all" button is greyed out.
Is there a way to run this report via the command line?
No, but maybe I'll work on fixing it soon (tm), just for you 😄
I have around 333k photos and for me the repair takes 5 minutes to run server-side and then the browser crashes. At some point once the server-side is done loading, memory use on the browser explodes from almost nothing to several GB within seconds and then the page gets killed. I've tried Safari, Edge and Firefox. Because of this, I have no way of executing the "repair all" function.
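If the browser crash comes from receiving and rendering the entire report at once (an assumption; nobody in this thread has confirmed the cause), one common mitigation is to hand the report to the client one page at a time. The sketch below shows the idea only; `report_page` and its default page size are hypothetical and not part of Immich.

```python
from typing import Dict, List


def report_page(items: List[str], page: int, page_size: int = 500) -> Dict[str, object]:
    """Return one slice of the report plus what the client needs to fetch the rest."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    total_pages = max(1, -(-len(items) // page_size))  # ceiling division
    return {
        "items": items[start:start + page_size],
        "page": page,
        "total_pages": total_pages,
        "total_items": len(items),
    }
```

The client would then request page 1, render it, and fetch further pages on demand, so its memory use is bounded by the page size rather than by the size of the library.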
Same issue here. Can't access the page and it crashes the server.