face_recognition throws a CUDA exception, but only in a Celery task

Open pmi123 opened this issue 1 year ago • 0 comments

face_recognition version: face-recognition==1.3.0, face-recognition-models==0.3.0
Python version: Python 3.8.16
Operating System: Ubuntu 22.04.2 LTS

Description

I have a face_recognition task that is part of a larger Django 4.1 site; the task runs in Celery with Redis. For the last several years it ran with no issues on older versions of Ubuntu, Celery, Django, Redis, and face-recognition. I then moved the app to an upgraded server and upgraded all the requirements. If I run the task without Celery on the upgraded server, it finds faces using either the cnn or the hog model. When I run the same code as a Celery task, I get an exception that points to a problem with my CUDA installation. However, I tested the CUDA installation with the sample programs, and the tests all pass without errors.

What I Did

Everything below was done inside the virtualenv for the Django project. Note that there is a backend MySQL database, and the image I have attached has document_id = 3443.

I used the "vanilla" face_recognition code, and both hog and cnn found the faces.

>>> import face_recognition
>>> image = face_recognition.load_image_file("![Priest_Lake_2001](https://user-images.githubusercontent.com/3002127/227678030-da0fd053-372a-44d7-a0a7-55bc7c1ec7db.png)")
>>> face_recognition.face_locations(image, model='cnn')
[(278, 490, 325, 443), (317, 556, 357, 517), (175, 410, 232, 353), (336, 381, 393, 324), (427, 404, 474, 356)]
>>> face_recognition.face_locations(image, model='hog')
[(182, 412, 234, 360), (286, 487, 329, 443), (314, 561, 366, 510), (343, 383, 395, 331), (426, 411, 488, 349)]
face_recognition.face_encodings produces 5 arrays for both hog and cnn.

I tested the find_faces task without using celery, and both cnn and hog found the faces.

>>> from biometric_identification.tasks import find_faces_task
>>> find_faces_task(3443, use_cuda=False)
It found all 5 faces in the photograph.
>>> find_faces_task(3443, use_cuda=True)
It found all 5 faces in the photograph.

I ran the find_faces task using Celery and got this exception:

Using CUDA
>>> from biometric_identification.tasks import find_faces_task
>>> find_faces_task.delay(3443, use_cuda=True)
It generated 5 exceptions like this:
[2023-03-24 17:18:17,484: ERROR/ForkPoolWorker-8] Hit an exception in find_faces_task Error while calling cudaGetDevice(&the_device_id) in file /tmp/pip-install-lwueab1p/dlib_294881559a49454d89cb1b07b2dcb694/dlib/cuda/gpu_data.cpp:204. code: 3, reason: initialization error
Traceback (most recent call last):
  File "/home/mark/python-projects/archive/biometric_identification/tasks.py", line 80, in find_faces_task
    face_locations = face_recognition.face_locations(image, model="cnn", number_of_times_to_upsample=0)
  File "/home/mark/.virtualenvs/archive/lib/python3.8/site-packages/face_recognition/api.py", line 119, in face_locations
    return [_trim_css_to_bounds(_rect_to_css(face.rect), img.shape) for face in _raw_face_locations(img, number_of_times_to_upsample, "cnn")]
  File "/home/mark/.virtualenvs/archive/lib/python3.8/site-packages/face_recognition/api.py", line 103, in _raw_face_locations
    return cnn_face_detector(img, number_of_times_to_upsample)
RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file /tmp/pip-install-lwueab1p/dlib_294881559a49454d89cb1b07b2dcb694/dlib/cuda/gpu_data.cpp:204. code: 3, reason: initialization error

Running the Celery task without CUDA (use_cuda=False) also produced an exception, this time from face_recognition.face_encodings rather than from face_recognition.face_locations:

[2023-03-24 17:49:48,872: ERROR/ForkPoolWorker-8] Hit an exception in find_faces_task Error while calling cudaGetDevice(&the_device_id) in file /tmp/pip-install-lwueab1p/dlib_294881559a49454d89cb1b07b2dcb694/dlib/cuda/gpu_data.cpp:204. code: 3, reason: initialization error
Traceback (most recent call last):
  File "/home/mark/python-projects/archive/biometric_identification/tasks.py", line 92, in find_faces_task
    face_encodings = face_recognition.face_encodings(image, known_face_locations=face_locations)
  File "/home/mark/.virtualenvs/archive/lib/python3.8/site-packages/face_recognition/api.py", line 214, in face_encodings
    return [np.array(face_encoder.compute_face_descriptor(face_image, raw_landmark_set, num_jitters)) for raw_landmark_set in raw_landmarks]
  File "/home/mark/.virtualenvs/archive/lib/python3.8/site-packages/face_recognition/api.py", line 214, in <listcomp>
    return [np.array(face_encoder.compute_face_descriptor(face_image, raw_landmark_set, num_jitters)) for raw_landmark_set in raw_landmarks]
RuntimeError: Error while calling cudaGetDevice(&the_device_id) in file /tmp/pip-install-lwueab1p/dlib_294881559a49454d89cb1b07b2dcb694/dlib/cuda/gpu_data.cpp:204. code: 3, reason: initialization error
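Both tracebacks come from a ForkPoolWorker process, so one thing worth checking is whether dlib was already imported in the parent worker before it forked. This is only a guess on my part, not something confirmed above: the CUDA runtime generally cannot be used in a forked child once the parent process has initialized it, and "initialization error" is the usual symptom. The sketch below uses only the standard library; `process_context` is a hypothetical helper name, not part of face_recognition or Celery.

```python
import multiprocessing
import os
import sys


def process_context(module_names=("dlib", "face_recognition")):
    """Describe the current process and which of the named modules are already loaded.

    Hypothetical diagnostic helper. The default module names are the two
    libraries involved in the tracebacks; pass others as needed.
    """
    return {
        "pid": os.getpid(),
        "ppid": os.getppid(),
        "process_name": multiprocessing.current_process().name,
        # Modules present in sys.modules were imported in this process
        # or inherited from the parent at fork time.
        "already_imported": sorted(n for n in module_names if n in sys.modules),
    }


if __name__ == "__main__":
    print(process_context())
```

Logging `process_context()` once at the top of tasks.py and once at the start of find_faces_task would show whether the task runs under a different pid than the one that imported the libraries. If it does, and dlib is listed as already imported in the child, the fork explanation becomes more plausible.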

The find_faces task:

@app.task(bind=True, base=utils.BaseTaskWithRetry)
def find_faces_task(self, document_id, use_cuda=settings.USE_CUDA):
    logger.debug("find_faces_task in tasks START")
    try:
        temp_file = None
        temp_image = None  # defined up front so the finally clause can test it safely
        from memorabilia.models import TaskStatus, Document
        args = "document_id=%s, use_cuda=%s" % (document_id, use_cuda)
        ts = TaskStatus(document_id_id=document_id, task_id=self.request.id, task_name='find_faces_task', task_args=args, task_status=TaskStatus.PENDING)
        ts.save()
        import time
        time_start = time.time()
        from biometric_identification.models import Face
        if Face.objects.filter(document_id=document_id).exists():
            # This document has already been scanned, so remove its faces and rescan.
            # Delete each object individually, per the Django docs, to ensure the
            # model's delete() method runs and updates the metadata.
            logger.debug("Document %s has already been scanned" % document_id)
            faces = Face.objects.filter(document_id=document_id)
            for face in faces:
                face.delete()
                logger.debug("Deleted face=%s" % face.tag_value.value)
        document = Document.objects.get(document_id=document_id)
        image_file = document.get_default_image_file(settings.DEFAULT_DISPLAY_IMAGE)
        image_path = image_file.path
        logger.debug("document_id=%s, image_path=%s" % (document_id, image_path))
        time_start_looking = time.time()
        temp_file = open(image_path, 'rb')
        temp_image = Image.open(temp_file)
        logger.debug("temp_image.mode=%s" % temp_image.mode)
        width, height = temp_image.size
        image = face_recognition.load_image_file(temp_file)
        # Get the coordinates of each face
        if use_cuda:
            # With CUDA installed
            logger.debug("Using CUDA for face recognition")
            face_locations = face_recognition.face_locations(image, model="cnn", number_of_times_to_upsample=0) 
        else:
            # without CUDA installed
            logger.debug("NOT using CUDA for face recognition")
            #face_locations = face_recognition.face_locations(image, number_of_times_to_upsample=2)
            face_locations = face_recognition.face_locations(image, model="hog", number_of_times_to_upsample=2)
        if len(face_locations) == 0:
            ts.task_status = TaskStatus.WARNING
            ts.comment = "Found %s faces" % len(face_locations)
        else:
            time_find_faces = time.time()
            # Get the face encodings for each face in the picture    
            face_encodings = face_recognition.face_encodings(image, known_face_locations=face_locations) 
            logger.debug("Found %s face locations and %s encodings" % (len(face_locations), len(face_encodings)))
            time_face_encodings = time.time()
            # Save the faces found in the database
            for location, encoding in zip(face_locations, face_encodings):
                # Create the new Face object and load in the document, encoding, and location of a face found
                # Locations seem to be of the form (y,x)
                from memorabilia.models import MetaData, MetaDataValue
                tag_type_people = MetaDataValue.objects.filter(metadata_id=MetaData.objects.filter(name='Tag_types')[0].metadata_id, value='People')[0]
                tag_value_unknown = MetaDataValue.objects.filter(metadata_id=MetaData.objects.filter(name='Unknown')[0].metadata_id, value='Unknown')[0]
                new_face = Face(document=document, face_encoding=numpy_to_json(encoding), face_location=location, image_size={'width': width, "height":height}, tag_type=tag_type_people, tag_value=tag_value_unknown)         
                # save the newly found Face object
                new_face.save()
                logger.debug("Saved new_face %s" % new_face.face_file) 
            time_end = time.time()
            logger.debug("total time = {}".format(time_end - time_start))
            logger.debug("time to find faces = {}".format(time_find_faces - time_start_looking))
            logger.debug("time to find encodings = {}".format(time_face_encodings - time_find_faces))
            ts.task_status = TaskStatus.SUCCESS
            ts.comment = "Found %s faces" % len(face_encodings)
        return document_id
    except Exception as e:
        logger.exception("Hit an exception in find_faces_task %s" % str(e))
        ts.task_status = TaskStatus.ERROR
        ts.comment = "An exception while finding faces: %s" % repr(e)
        ts.save(update_fields=['task_status', 'comment'])
        raise Exception("Hit an exception in find_faces_task for document_id=%s and use_cuda=%s" %(document_id, use_cuda)) from e
    finally:
        logger.debug("Finally clause in find_faces_task")
        logger.debug("temp_file=%s" % temp_file)
        logger.debug("temp_image=%s" % temp_image)
        if temp_file:
            temp_file.close()
            logger.debug("closed temp_file=%s" % temp_file)
        if temp_image:
            temp_image.close()
            logger.debug("closed temp_image=%s" % temp_image)
        ts.save(update_fields=['task_status', 'comment'])
        logger.debug("find_faces_task END")
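If the fork explanation holds (again, an assumption), one workaround to try is deferring the import of face_recognition until the task body runs, so that its dlib models are loaded inside the worker process rather than in the parent. A minimal sketch follows; `load_in_worker` and `find_face_locations` are hypothetical names introduced for illustration, and the `face_locations` call simply mirrors the one in the task above.

```python
import importlib


def load_in_worker(module_name):
    """Import a module at call time instead of at module import time.

    This only helps if nothing in the parent process imports the same
    module first; a module that is already loaded is inherited by forked
    children, and import_module just returns it.
    """
    return importlib.import_module(module_name)


def find_face_locations(image, module_name="face_recognition"):
    # Hypothetical helper: the import happens here, inside the process
    # that actually executes the task.
    fr = load_in_worker(module_name)
    return fr.face_locations(image, model="cnn", number_of_times_to_upsample=0)
```

For this to have any effect, the top-level `import face_recognition` in tasks.py would have to go, along with any other import path that loads it before the worker forks. Another option to test is starting the Celery worker with `--pool=solo`, which runs tasks in the worker process itself instead of forked children, at the cost of concurrency.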

Mar 25 '23 01:03 pmi123
