
Multiple images per person

Open stanbar opened this issue 2 years ago • 6 comments

Hello, as far as I understand, the package does not allow adding multiple images of the same person; the update function just replaces the embedding. Would you consider adding this feature? Or can you give me some suggestions, pitfalls to avoid, or direction on how I should implement it?

stanbar avatar Apr 08 '24 13:04 stanbar

Try with check_similar=False.

ids, failed_indexes = db.add_many(
    imgs=imgs,
    names=names,
    check_similar=False
)

shhossain avatar Apr 09 '24 07:04 shhossain

@shhossain I have a folder of images containing multiple faces in each image, so how can I extract only the unique faces and save them in another folder using this facedb? I did this using deepface and the results are not satisfactory but fast; I also tried the face-recognition library, which gave me better results but took a lot of time. Here's what I wanted: given a folder of images, we have to extract the faces in each image and check whether each face is already in the unique_faces folder [db]; if yes, we skip it, otherwise we add it as a unique face to the folder. I'm stuck at the moment; I need both accuracy and performance. I think facedb will help me accomplish the above task. All I need is faster comparison of face data.

Raghucharan16 avatar Apr 16 '24 10:04 Raghucharan16

> Try with check_similar=False. […]

Well, it's not necessarily what I need, but thank you anyway. I would like to create a similarity search based on all pictures of a person: kind of create an "average" embedding of a person based on all their pictures, and then search for similarities between people based on the average embeddings. Something like this: https://learn.deeplearning.ai/courses/building-applications-vector-databases/lesson/6/facial-similarity-search

I guess I would have to compare all embeddings of person1 against all embeddings of person2 and compute the average distance.
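A minimal NumPy sketch of that idea, outside of FaceDB (the function names and the toy 2-d embeddings below are illustrative, not part of FaceDB's API):

```python
import numpy as np

def average_embedding(embeddings):
    """Mean of one person's face embeddings (one row per picture)."""
    return np.mean(np.asarray(embeddings, dtype=float), axis=0)

def mean_pairwise_distance(emb_a, emb_b):
    """Average Euclidean distance between every embedding of person A
    and every embedding of person B."""
    a = np.asarray(emb_a, dtype=float)      # shape (n_a, dim)
    b = np.asarray(emb_b, dtype=float)      # shape (n_b, dim)
    diffs = a[:, None, :] - b[None, :, :]   # shape (n_a, n_b, dim)
    dists = np.linalg.norm(diffs, axis=-1)  # shape (n_a, n_b)
    return dists.mean()

# Toy 2-d "embeddings" for two people
person1 = [[0.0, 0.0], [0.0, 2.0]]
person2 = [[3.0, 0.0], [3.0, 2.0]]
print(average_embedding(person1))                 # [0. 1.]
print(mean_pairwise_distance(person1, person2))   # ~3.30
```

In practice you would either store the average embedding per person, or compute the mean pairwise distance between two people's embedding sets as above; the averaging approach is cheaper at query time.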

stanbar avatar Apr 16 '24 12:04 stanbar

> @shhossain I have a folder of images containing multiple faces in each image, so how can I extract only the unique faces and save them in another folder using this facedb? […]

I copy-pasted your question into ChatGPT; here is the answer:

To achieve your goal of extracting unique faces from a folder of images and ensuring each face is only saved once, using the FaceDB library, you can follow this script. This script assumes that you have a setup for either face_recognition or DeepFace as described in the FaceDB documentation and that you are using a filesystem-based storage for simplicity.

Here is a Python script that will:

  1. Load all images from a specified directory.
  2. Extract faces from each image.
  3. Check if each extracted face is already in a "unique_faces" folder/database.
  4. If not already present, save the new unique face.
from facedb import FaceDB
from glob import glob
import os
from PIL import Image

# Initialize the FaceDB
db = FaceDB(path="unique_faces", module="face_recognition")  # choose 'face_recognition' or 'deepface'

# Load images from folder
image_folder = "path_to_images_folder"  # update with your folder path
images = glob(os.path.join(image_folder, "*.jpg"))

def save_face(face_region, face_id):
    """Save an extracted face (numpy array) to a file."""
    face_image = Image.fromarray(face_region)
    face_image.save(f"unique_faces/{face_id}.jpg")

def extract_and_save_unique_faces():
    for img_path in images:
        # Recognize and potentially extract faces
        results = db.recognize(img=img_path, include=["img", "id"])

        if results:
            for result in results:
                face_id = result["id"]
                face_img = result["img"]  # numpy array (the cropped face)

                # Check whether this face is already in the database
                if not db.query(face_id=face_id):
                    # Save the new unique face to disk
                    save_face(face_img, face_id)
                    # Add the new face to the database
                    db.add(name=f"face_{face_id}", img=f"unique_faces/{face_id}.jpg")
                else:
                    print(f"Face {face_id} already exists in the database.")
        else:
            print("No faces found in image:", img_path)

# Run the function
extract_and_save_unique_faces()

Explanation:

  • Initialization: The FaceDB is initialized with a path to store recognized face images. You can select the module (face_recognition or DeepFace) based on your accuracy and performance requirements.
  • Image Processing: For each image in your specified directory, the script uses db.recognize() to find faces and get their images and IDs.
  • Face Saving and Checking: If a face is recognized but not found in the database, it's saved in a separate folder and added to the database.
  • Optimization: By managing database queries efficiently and storing only new faces, this approach tries to balance between redundancy checks and performance.

This script is a starting point and may need to be adapted based on specific needs, such as handling different file types or more sophisticated error handling.

stanbar avatar Apr 16 '24 12:04 stanbar

> @shhossain I have a folder of images containing multiple faces in each image […]
>
> I copy-pasted your question into ChatGPT; here is the answer: […]

Thanks for the response. I actually tried the above code, and the results are bad. I have a small folder of 8 images, each a group photo, and some faces may be blurred in some images; but using the above code, I'm not even getting the faces detected. I got output like this:

No faces found in image: resized-images/20240403_151426.jpg
No faces found in image: resized-images/20240403_151410.jpg
No faces found in image: resized-images/20240403_151432.jpg
No faces found in image: resized-images/20240403_151405.jpg
No faces found in image: resized-images/20240403_151415.jpg

So I'll try to extract all the faces into another folder first and then pass them to the db after checking them. Yes, it takes time and is not that efficient, but I think there isn't another way, it seems. For face detection I'm using YOLOv8 or RetinaFace; both give me good results. But the question is: how accurately does the same face-recognition library find the embeddings of those faces when it is unable to detect the faces itself?
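The "extract first, then dedupe" plan can be sketched as: detect faces with YOLOv8/RetinaFace, compute one embedding per crop (e.g. with face_recognition.face_encodings), and keep a crop only if it is farther than a threshold from every embedding seen so far. A minimal NumPy sketch of just the dedup step — keep_if_unique, the toy 2-d vectors, and the 0.6 threshold (a common default for face_recognition's 128-d encodings) are illustrative assumptions, not FaceDB API:

```python
import numpy as np

TOLERANCE = 0.6  # typical threshold for face_recognition's 128-d encodings

def keep_if_unique(known, candidate, tolerance=TOLERANCE):
    """Append `candidate` and return True only if it is farther than
    `tolerance` from every embedding already in `known`."""
    if known and min(np.linalg.norm(np.asarray(k) - candidate) for k in known) <= tolerance:
        return False
    known.append(candidate)
    return True

known = []
print(keep_if_unique(known, np.array([0.0, 0.0])))  # True: first face seen
print(keep_if_unique(known, np.array([0.1, 0.0])))  # False: too close to an existing face
print(keep_if_unique(known, np.array([2.0, 0.0])))  # True: a genuinely new face
```

Note this linear scan is O(n) per face; FaceDB's vector index exists precisely to make that lookup faster at scale.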

Raghucharan16 avatar Apr 16 '24 13:04 Raghucharan16

> Well, it's not necessarily what I need, but thank you anyway. I would like to create a similarity search based on all pictures of a person […]

Oh, I see now! I misunderstood your issue.

For your use case, you can first extract only the embeddings of a face with

db.embedding_func(img)

To get the average you can use numpy:

same_person_all_embeddings = [[....],[...]]
average_embeddings = np.mean(same_person_all_embeddings, axis=0)

Now you can add the average embeddings normally and use recognize with top_k:

db.recognize(img, top_k=10)  # returns the 10 most similar faces; just remove the first one
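The averaging step above, runnable end to end — the embedding values here are dummies standing in for db.embedding_func(img) outputs, so no FaceDB call is exercised:

```python
import numpy as np

# Dummy 3-d embeddings standing in for db.embedding_func(img) results
same_person_all_embeddings = [
    [0.10, 0.90, 0.30],
    [0.12, 0.88, 0.34],
    [0.08, 0.92, 0.26],
]

# One vector summarizing the person, element-wise mean across pictures
average_embeddings = np.mean(same_person_all_embeddings, axis=0)
print(average_embeddings)  # [0.1 0.9 0.3]
```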

shhossain avatar Apr 16 '24 14:04 shhossain

Is there a way of adding our own metadata when uploading to Pinecone?

Ntchinda-Giscard avatar May 29 '24 13:05 Ntchinda-Giscard

> For your use case, you can first extract only the embeddings of a face with db.embedding_func(img) […]

What is the default model used for getting the embeddings from a face? Like Facenet, VGG-Face, GhostFaceNet, etc.

Raghucharan16 avatar May 29 '24 13:05 Raghucharan16

@Ntchinda-Giscard for a single image you can do this. Suppose you want to add age or gender:

db.add(name='Test 1',img=img,age=20,gender='Male', add_anything_key='add_anything_value')

For multiple images:

db.add_many(names=['Test 1', 'Test 2'],imgs=[test1img,test2img],metadata=[{'age':20,'gender':'Male'},{'age':21,'gender':'Female'}])

Here are the source code references: add, add_many

shhossain avatar May 29 '24 14:05 shhossain

@Raghucharan16 by default it uses the model from the face_recognition Python library (dlib's ResNet-based 128-d face encoder).

shhossain avatar May 29 '24 14:05 shhossain