Spurious face recognition (random photos of random things and people recognized as one face)
The bug
Immich grouped about 200 completely unrelated photos (different objects, people, landscapes, food, game screenshots) into a single person in face recognition. The attached images below make the issue obvious.
And here are some of the photos that were associated with this "person":
As can easily be seen, the first is a dog on a black background, the second is a screenshot from Genshin Impact, the third is a group of people on grass, the fourth is a completely unrelated group of people in the snow, the fifth is some birds in the rain, and the last is some food in a pan. These photos are 100% unrelated and should not have been bundled together as one person.
Any way I can "delete" this person?
The OS that Immich Server is running on
Raspberry Pi OS 64 bit latest version
Version of Immich Server
v1.113.1
Version of Immich Mobile App
N/A
Platform with the issue
- [X] Server
- [X] Web
- [X] Mobile
Your docker-compose.yml content
#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#
name: immich
services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.transcoding.yml
    #   service: cpu # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /home/edu_adm/EGNAS2_shared_folder:/home/edu_adm/EGNAS2_shared_folder
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always
    healthcheck:
      disable: false
  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false
  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:e3b17ba9479deec4b7d1eeec1548a253acc5374d68d3b27937fcfe4df8d18c7e
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always
  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum";
      interval: 5m
      start_interval: 30s
      start_period: 5m
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: always
volumes:
  model-cache:
Your .env content
# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables
# The location where your uploaded files are stored
UPLOAD_LOCATION=./library
# The location where your database files are stored
DB_DATA_LOCATION=./postgres
# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
# TZ=Etc/UTC
# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release
# Connection secret for postgres. You should change it to a random password
# Please use only the characters `A-Za-z0-9`, without special characters or spaces
DB_PASSWORD=<my_pass_redacted_to_post_here_on_github!>
# The values below this line do not need to be changed
###################################################################################
DB_USERNAME=postgres
DB_DATABASE_NAME=immich
Reproduction steps
I think it might be difficult to reproduce, but once it happens it can be observed by:
- Go to "explore"
- Click on "view all" of "People"
- Scroll to find the affected "Person" (which is a spurious collection of Photos) ...
Relevant log output
No response
Additional information
If you need help to debug this I am available to jump on discord to share the screen/do what is needed to help : )
Have you changed any of your machine learning settings at all?
no, at least not that I know of... how can this be changed?
These are the settings there, I don't recall changing anything:
Those seem like just the defaults, so I really have no idea why it would have done this. @mertalev any ideas?
That's super weird. What do the bounding boxes for these faces (or "faces") look like? You can open an image's info panel and hover over the person to see it.
Did you change the thumbnail settings? Can you post it?
When I mouse over this "person", it detects the face of the Genshin character:
.. or the "face" of the duck:
or a potato:
I think I have not changed that either, here it is:
Could you run a few SQL queries for me and share the output for each?
select * from pg_vector_index_stat;
select count(*) from asset_faces;
select count(*) from face_search;
with
  embeddings as (
    select "originalFileName", embedding
    from
      assets
      inner join asset_faces
        on assets.id = asset_faces."assetId"
      inner join face_search
        on asset_faces.id = face_search."faceId"
    where
      assets."originalFileName" in ('20220408091020.png', '20211030_115938.jpg', '20230306_123821.jpg')
  )
select this."originalFileName" image1, other."originalFileName" image2, this.embedding <=> other.embedding distance
from embeddings this, embeddings other;
can you help me how/where exactly I can do that?
You can run docker exec -it immich_postgres psql --dbname=immich --username=<DB_USERNAME> to connect to the database via the container directly, where <DB_USERNAME> is the value from your .env file. Then, you can just paste in a query and hit enter.
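If pasting into an interactive session is inconvenient, the same connection can be scripted. The following is a minimal Python sketch, not an official Immich tool: the helper name `build_psql_command` is hypothetical, the container and database names are taken from the compose and .env files above, and it uses psql's `--command` option so the query runs non-interactively (without `-it`).

```python
import shlex


def build_psql_command(query, db_username, container="immich_postgres", db_name="immich"):
    """Build a `docker exec` argument list that runs a single query through psql.

    `container` and `db_name` default to the values in the compose/.env files
    posted in this issue; adjust them if your setup differs.
    """
    return [
        "docker", "exec", container,
        "psql",
        f"--dbname={db_name}",
        f"--username={db_username}",
        f"--command={query}",
    ]


# "postgres" is the DB_USERNAME value from the .env file above.
cmd = build_psql_command("select count(*) from asset_faces;", "postgres")

# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
```

To actually run the query from Python, the list can be passed to `subprocess.run(cmd, check=True)` on the machine where the containers are running.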
I have the same issue, unfortunately. At least it only happens a couple of times in the people section, so I just hide the "fake" person that was detected.
immich=# select * from pg_vector_index_stat;
tablerelid | indexrelid | tablename | indexname | idx_status | idx_indexing | idx_tuples | idx_sealed | idx_growing | idx_write | idx_size | idx_options
------------+------------+--------------+------------+------------+--------------+------------+------------+-------------+-----------+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
17319 | 17331 | smart_search | clip_index | NORMAL | t | 130483 | {130413} | {70} | 0 | 285602760 | {"vector":{"dimensions":512,"distance":"Cos","kind":"F32"},"segment":{"max_growing_segment_size":20000,"max_sealed_segment_size":1000000},"optimizing":{"sealing_secs":60,"sealing_size":1,"optimizing_threads":2},"indexing":{"hnsw":{"m":16,"ef_construction":300,"quantization":{"trivial":{}}}}}
17551 | 17575 | face_search | face_index | NORMAL | f | 116850 | {116752} | {} | 98 | 257665816 | {"vector":{"dimensions":512,"distance":"Cos","kind":"F32"},"segment":{"max_growing_segment_size":20000,"max_sealed_segment_size":1000000},"optimizing":{"sealing_secs":60,"sealing_size":1,"optimizing_threads":2},"indexing":{"hnsw":{"m":16,"ef_construction":300,"quantization":{"trivial":{}}}}}
(2 rows)
immich=# select count(*) from asset_faces;
 count
-------
 90480
(1 row)
immich=# select count(*) from face_search;
 count
-------
 90480
(1 row)
immich=# with embeddings as ( select "originalFileName", embedding from assets inner join asset_faces on assets.id = asset_faces."assetId" inner join face_search on asset_faces.id = face_search."faceId" where assets."originalFileName" in ('20220408091020.png', '20211030_115938.jpg', '20230306_123821.jpg') ) select this."originalFileName" image1, other."originalFileName" image2, this.embedding <=> other.embedding distance from embeddings this, embeddings other;
       image1        |       image2        |  distance
---------------------+---------------------+------------
 20211030_115938.jpg | 20211030_115938.jpg |          0
 20211030_115938.jpg | 20220408091020.png  |  0.7072995
 20211030_115938.jpg | 20230306_123821.jpg | 0.68505836
 20220408091020.png  | 20211030_115938.jpg |  0.7072995
 20220408091020.png  | 20220408091020.png  |          0
 20220408091020.png  | 20230306_123821.jpg |  0.7904459
 20230306_123821.jpg | 20211030_115938.jpg | 0.68505836
 20230306_123821.jpg | 20220408091020.png  |  0.7904459
 20230306_123821.jpg | 20230306_123821.jpg |          0
(9 rows)
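For anyone reading the numbers above: the index options list "distance":"Cos", so the sketch below assumes `<=>` returns cosine distance (1 minus cosine similarity). Under that assumption, 0 means the two embeddings point in the same direction and 1 means they are orthogonal, so the values of roughly 0.69 to 0.79 indicate quite dissimilar embeddings. This is a minimal Python illustration of what the cross join computes, using made-up 3-dimensional vectors and hypothetical file names instead of the real 512-dimensional face embeddings.

```python
import math
from itertools import product


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def pairwise_distances(embeddings):
    """Mimic `from embeddings this, embeddings other`: every ordered pair, including self-pairs."""
    return [
        (name1, name2, cosine_distance(vec1, vec2))
        for (name1, vec1), (name2, vec2) in product(embeddings.items(), repeat=2)
    ]


# Toy data only: these names and vectors are illustrative, not taken from the database.
toy_embeddings = {
    "a.jpg": [1.0, 0.0, 0.0],
    "b.png": [0.0, 1.0, 0.0],
    "c.jpg": [1.0, 1.0, 0.0],
}

rows = pairwise_distances(toy_embeddings)
for image1, image2, distance in rows:
    print(f"{image1} | {image2} | {distance:.4f}")
```

As in the query output, three inputs produce nine rows, the self-pairs have distance 0, and each off-diagonal pair appears twice because the distance is symmetric.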
@mertalev could you check the result of the queries ^?
I have the same issue, resulting in a single "person" with 800+ assets linked to it. I don't mind that it happens, no detection is flawless, but this feature request ( https://github.com/immich-app/immich/discussions/6559 ) would be nice so I can just delete it.
//edit: decided to create a simple Python script to do the cleanup.
This isn't really a bug as much as it is just a fundamental limitation of how machine learning works and the accuracy of the models that are available to us. The things we can do are make it easier to manage people, hide them, delete and recluster them, etc. These are feature requests and are tracked as GitHub Discussions.