Identify Smiles errors - Facial Recognition in Historical Photographs with Artificial Intelligence in Python
I am planning on using this lesson in a workshop on 3 June 2025, but am unable to get some elements to work. Am I missing something in the set-up? Here is what I did.
From the Jupyter notebook linked to in the lesson, I went to the repository and opened Colab from there.
The runtime set-up is as follows: the lesson doesn't specify a particular GPU, just "GPU", so I have gone for the free T4.
The code runs fine until it gets to the code block for Identify Smiles: Code
On running this section the following is output:
/content/yearbook/yearbook
An error happened
An error happened
An error happened
An error happened
... 57
The following is then in the output file:
Hello @quirksahern, thank you for getting in touch.
I tried running the notebook on my side using the same settings, and I got the same results as you.
However, I then tried again using the following settings instead:
With Python 3 on CPU, I did not get these error results. In the output for the cell Identify Smiles: Code, I did get a few An error happened lines, but only 17 times out of many more lines of successful results.
The downloaded CSV contains the following results:
Years,Smiles,Non-Smiles,Error Weight
1911,0.0,1.0,0.0
1921,0.07692307692307693,0.9230769230769231,0.0
1931,0.05,0.85,0.1
1941,0.0,0.8,0.2
1951,0.27972027972027974,0.6888111888111889,0.03146853146853147
1961,0.40350877192982454,0.5614035087719298,0.03508771929824561
But this isn't ideal, because the Identify Smiles: Code section is where DeepFace.analyze() is called, which is the most computationally intensive part and the one that would benefit the most from a GPU rather than the CPU!
Thank you for pointing this out, @quirksahern. We will work on a solution.
Notes on my debugging work:
I added a few lines to the Identify Smiles: Code block in order to pull out a specific error message when using T4 GPU. This is what all the results look like:
Analyzing: ./images/1961/1961 faces/1_page-55.png
An error happened while analyzing ./images/1961/1961 faces/1_page-55.png: Exception encountered when calling Conv2D.call().
{{function_node __wrapped__Conv2D_device_/job:localhost/replica:0/task:0/device:GPU:0}} No DNN in stream executor. [Op:Conv2D]
Arguments received by Conv2D.call():
• inputs=tf.Tensor(shape=(1, 48, 48, 1), dtype=float32)
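The exact debugging lines weren't shared in this thread, but they were presumably a try/except wrapper along these lines: a sketch (not the original code) of catching the exception around each analysis call and printing the full message and traceback instead of a bare "An error happened". The stand-in function below is purely illustrative, so the pattern can be shown without DeepFace or a GPU.

```python
import traceback

def analyze_with_details(analyze_fn, img_path):
    """Run one analysis call; on failure, print the full error and traceback
    instead of swallowing it, then return None so the loop can continue."""
    try:
        return analyze_fn(img_path)
    except Exception as exc:
        print(f"An error happened while analyzing {img_path}: {exc}")
        traceback.print_exc()
        return None

# Stand-in for DeepFace.analyze, raising the error seen on the T4 runtime:
def fake_analyze(path):
    raise RuntimeError("No DNN in stream executor. [Op:Conv2D]")

result = analyze_with_details(fake_analyze, "./images/1961/1961 faces/1_page-55.png")
```

In the notebook, `fake_analyze` would be replaced by the real `DeepFace.analyze` call from the Identify Smiles: Code block.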
ChatGPT (who is helping my non-expert self navigate this!) suggests:
What “No DNN in stream executor” Means
When TensorFlow runs a Conv2D on GPU it tries to call into NVIDIA’s cuDNN library (the “DNN”). The error No DNN in stream executor. [Op:Conv2D] means that, at runtime, TensorFlow did not find a compatible cuDNN implementation for your GPU-side convolution and therefore cannot perform the operation on the GPU.
On Colab’s T4 instance you normally have CUDA and cuDNN installed, but your TensorFlow build may have been compiled against a different cuDNN version than Colab provides. For example, many users of TF 2.17+ saw that their binary was built with cuDNN 9.3.0 but Colab was loading cuDNN 9.1.0, so every convolution on the GPU failed with this same message.
Why It’s Happening in Your Notebook
- TensorFlow–cuDNN Mismatch: Your notebook’s tensorflow package was likely installed or upgraded (perhaps indirectly by DeepFace) to a version that expects a newer cuDNN.
Colab’s pre-installed GPU drivers and cuDNN may not match that expectation, so the DNN kernels aren’t available in the GPU “stream executor.”
- enforce_detection=False
Disabling face detection (with enforce_detection=False) still passes images through the emotion network, so you still hit that same Conv2D path.
How to Fix It
- Downgrade TensorFlow to Match Colab’s cuDNN
Colab’s GPU setup tends to work perfectly with TensorFlow 2.15.0. Downgrading often restores compatibility:
!pip install tensorflow==2.15.0
Then restart the runtime so that the matching cuDNN libraries load correctly.
- Upgrade to a Nightly TensorFlow with Newer cuDNN Support
If you want to stick with a more recent TensorFlow, install the nightly build which bundles newer CUDA/cuDNN compatibility:
!pip install tf-nightly
That resolves the missing DNN kernels in TF 2.19+ environments.
Recommendation: The quickest, most reliable route is to pin TensorFlow 2.15.0 in your Colab environment, then restart. That will restore GPU-accelerated convolutions without further tinkering.
I first tried Solution 2 by adding a new cell with !pip install --upgrade tf-nightly at the very beginning. However, this resulted in the same error message as before, and I didn't have capacity to further debug this.
I then tried Solution 1 by specifying the version of tensorflow to use, like this:
%%capture
%mkdir yearbook
%cd yearbook
!pip install --upgrade --no-cache-dir gdown
!gdown --id "1NHT8NN8ClBEnUC5VqkP3wr2KhyiIQzyU"
!unzip PHfiles.zip
%mkdir images
!pip install PyMuPDF
!pip install dlib
!pip install DeepFace
!pip install tensorflow==2.15.0
import os, shutil, fitz, cv2, numpy as np, pandas as pd, dlib, tensorflow as tf
from os.path import dirname, join
from deepface import DeepFace
With this specification, the Identify Smiles: Code block worked! And the results in the CSV file were similar to the ones I got when running on my CPU:
Years,Smiles,Non-Smiles,Error Weight
1911,0.0,1.0,0.0
1921,0.0,0.9230769230769231,0.07692307692307693
1931,0.1,0.85,0.05
1941,0.0,0.8333333333333334,0.16666666666666666
1951,0.2727272727272727,0.6958041958041958,0.03146853146853147
1961,0.42105263157894735,0.543859649122807,0.03508771929824561
Hello @hawc2 (and perhaps @c-goldberg, if you are around!), what do you think? Does this seem like a sensible solution to the problem?
i.e. simply to specify the tensorflow version (2.15.0) in the first block of code.
The issue I anticipate with this is that we don't know whether this specific version will remain compatible in the long term. Ideally, the suggested Solution 2 seems better, as it would keep updating the compatibility, but I don't have the expertise to understand what more is needed here.
@charlottejmc this seems like a good solution to me, but I'm curious to hear @c-goldberg's perspective
Hi Charlotte,
Thanks for digging into this! I think I've reached the same position: I don't know enough about TensorFlow to know why solution #2 (which seems preferable to me) isn't working. I think solution #1 is a fine one for now. Hopefully Colab continues to support this version of TF.
Best, Charlie
Thank you for your feedback @c-goldberg!
Hello @quirksahern, we're working on updating the live lesson and notebook at the moment.
In the mean time, please feel free to adapt the steps as agreed by Charlie i.e. to specify the tensorflow version (2.15.0) in the first block of code.
Thank you again for reaching out to us about this! ✨
A massive thank you! I'm planning on using this with students in the next couple of weeks. The plan is to really dig into some tech ethics.
For info when I try to specify the version I then get an error with the following line, so may just run as CPU:
ValueError                                Traceback (most recent call last)
<cell line: 0>
      9 get_ipython().system('pip install DeepFace')
     10 get_ipython().system('pip install tensorflow==2.15.0')
---> 11 import os, shutil, fitz, cv2, numpy as np, pandas as pd, dlib, tensorflow as tf

6 frames
/usr/local/lib/python3.11/dist-packages/numpy/random/_pickle.py
----> 1 from .mtrand import RandomState

numpy/random/mtrand.pyx in init numpy.random.mtrand()

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
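[Editorial note: this "dtype size changed" error typically means a compiled extension module was built against a different numpy ABI than the one currently loaded. One hedged workaround worth trying (untested against this notebook; the numpy pin is an assumption, not something confirmed in this thread) is to pin numpy explicitly alongside the pinned TensorFlow, then restart the runtime before importing anything:]

```shell
# In a Colab cell (prefix with "!" when run from a notebook):
pip install "tensorflow==2.15.0" "numpy==1.26.4"
# ...then Runtime > Restart session, so the previously loaded numpy
# extension modules aren't mixed with the newly installed ones.
```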
Yes, I'm seeing that now, too. In the next couple of weeks, I'd like to find a better resolution for these errors that maintains GPU use, since the lesson discusses its importance in object detection workflows. For now, forcing Colab to use the CPU instead does allow for successful execution, so that seems like a good solution.
Adding this immediately after the import line (which is line 12) works for me: os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
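[Editorial note: for readers following along, a minimal sketch of what this line does. The TensorFlow check in the comments is a suggested verification, assuming TensorFlow hasn't already initialized its GPU context before the variable is set.]

```python
import os

# "-1" matches no CUDA device index, so CUDA-based libraries such as
# TensorFlow see no GPU at all and fall back to the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# Once TensorFlow is imported, you can confirm the GPU is hidden with:
#   import tensorflow as tf
#   tf.config.list_physical_devices("GPU")   # expected to return []
```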
Hi again @c-goldberg, could you please explain a little further how you use os.environ["CUDA_VISIBLE_DEVICES"] = "-1", and what it does? Is it in conjunction with the !pip install tensorflow==2.15.0 specification?
I'd like to put up a temporary warning message for our readers until we can find a more durable resolution.
Thank you so much for your time!
Dear Charlie @c-goldberg,
I hope you have been well, and that you are enjoying the summer!
I was hoping to hear back from you about this tensorflow issue in Facial Recognition in Historical Photographs with Artificial Intelligence in Python. Back in May, we were really close to finding a workable fix, before it suddenly stopped reliably working!
Would you perhaps have some time to dedicate over the next few weeks to collaborate on a solution, or at least a clear warning message that lets readers know what to expect?
Thank you very much in advance for your help. ✨
Hi Charlotte,
Thanks so much for the reminder. Yes! I will put this on my to do list over the next few weeks. I'll be in touch.
Best, Charlie
Many thanks, Charlie @c-goldberg. We'd be enormously grateful for your support with this.
All best, Anisa
Hi all,
Thanks so much for your patience with this. I can confirm that the temporary workaround continues to work: adding os.environ["CUDA_VISIBLE_DEVICES"] = "-1" at line 13 does allow successful execution of the program. This instructs Colab to disable use of the GPU and use the CPU instead. It does not require including !pip install tensorflow==2.15.0 to force the program to use an older version of tensorflow. For now, I think adding that line of code to the Colab file is sufficient.
I would also agree that a disclaimer alerting readers that the process is not working ideally would be helpful for now: "Due to a compatibility issue between TensorFlow and Google Colab, we've disabled GPU use to allow the program to execute successfully. In the meantime, we're working on a permanent solution that continues to use the GPU as described in the lesson."
Unfortunately, I'll need to continue to dig into the deeper problem of why the program fails to execute with the latest version of TensorFlow. I notice that Colab does not provide much debugging info, which is making it difficult to see exactly what is happening. Charlotte, your first message from May mentions adding some code in order to see a more detailed error message. Can you share with me what you added? It seems like this may be a problem that's beyond my Python abilities, but I'm hoping I can find a satisfactory solution.
Best, Charlie
Hello Charlie @c-goldberg,
Thank you for your note, and for the time you've already dedicated to investigating this. We are grateful for your commitment to the sustainability of your lesson.
First of all, just to explain that Charlotte has moved on from our team in the past month, so unfortunately she isn't here to answer your question about which lines of code were added to generate the more detailed error report she mentions. As Charlotte indicates, the debugging work she did was guided by ChatGPT, and I suspect that any additional code may have been generated by ChatGPT too.
I think the most sensible next step might be for me to add a general note at the header of the lesson which indicates that we've identified a problem and share a link back to the discussion in this issue. That way, learners can review this conversation and continue their own troubleshooting.
If the addition of the line os.environ["CUDA_VISIBLE_DEVICES"] = "-1" would make the lesson usable as it is, I think that could be worth us adding. (Does this line disable the GPU?)
Before I prepare that, may I ask if you could clarify exactly where this line should be added to the notebook as rendered in nbviewer? Am I correct in understanding that you are suggesting this should be added as the final line of the Preliminaries block?
[...]
import os, shutil, fitz, cv2, numpy as np, pandas as pd, dlib, tensorflow as tf
from os.path import dirname, join
from deepface import DeepFace
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
We can also add the alert message you have suggested. Do you agree this would be best placed at the beginning of the section titled Preparing Your Environment?
With our thanks, Anisa
Hi Anisa,
Thanks for these details. Here are some responses:
- Yes, the new line of code disables the GPU for the session. Placing it where you do at the end of Preliminaries makes the most sense.
- Also yes to placing the note at the beginning of that section. However, I notice that the lesson currently includes a recommendation that the user should ensure that GPU use is enabled; since we're disabling the GPU, we should remove that notice.
In the long run, finding a solution that uses the GPU without errors popping up is attractive, but I also wonder if the simpler solution isn't to revise the lesson to keep the GPU disabled in the long term. The total execution time of the program isn't affected by using the CPU instead. Advanced users will certainly want to use a GPU if/when they create their own object detector, but I think they would end up writing such a program from scratch, which makes our question less pressing in my mind. I don't think a revision like this would be onerous, and I would just want to make it clear to the reader that we are doing so to avoid future dependency errors with Python packages. Let me know your thoughts.
Best, Charlie
Many thanks, Charlie @c-goldberg. I really appreciate the thought you've given this.
I'm preparing a PR to make the following short-term adjustments: https://github.com/programminghistorian/jekyll/pull/3643
- Add os.environ["CUDA_VISIBLE_DEVICES"] = "-1" as the final line of the Preliminaries block
- Add a new note explaining the compatibility problem between TensorFlow and Google Colab, and the decision to disable the GPU
- Remove the existing note advising readers to ensure connection to a GPU runtime
You can review the details in rich-diff and let me know if you feel anything needs adjustment.
--
We'd be enormously grateful for your help to adapt this lesson to improve its sustainability. If you have capacity to offer us your expertise (and you think the revisions wouldn't be too onerous) then I'd be delighted to collaborate with you to implement any necessary updates. Depending on whether the revisions required are minor or substantial, we can also initiate a conversation with the Managing Editor Alex @hawc2 about whether the new version of the lesson should be assigned a fresh DOI.