Identify Smiles errors - Facial Recognition in Historical Photographs with Artificial Intelligence in Python
I am planning on using this lesson in a workshop on 3 June 2025, but am unable to get some elements to work. Am I missing something in the set-up? Here is what I did.
From the Jupyter notebook linked to in the lesson, I went to the repository and opened Colab from there.
The runtime set-up is as follows: the lesson doesn't specify a particular GPU, just "GPU", so I have gone for the free T4.
The code runs fine until it gets to the code block for Identify Smiles: Code
On running this section the following is output:
/content/yearbook/yearbook
An error happened
An error happened
An error happened
An error happened
... 57
The following is then in the output file:
Hello @quirksahern, thank you for getting in touch.
I tried running the notebook on my side using the same settings, and I got the same results as you.
However, I then tried again using the following settings instead:
With Python 3 on CPU, I did not get these error results. In the output for the cell Identify Smiles: Code, I did get a few An error happened lines, but only 17 times out of many more lines of successful results.
The downloaded CSV contains the following results:
Years,Smiles,Non-Smiles,Error Weight
1911,0.0,1.0,0.0
1921,0.07692307692307693,0.9230769230769231,0.0
1931,0.05,0.85,0.1
1941,0.0,0.8,0.2
1951,0.27972027972027974,0.6888111888111889,0.03146853146853147
1961,0.40350877192982454,0.5614035087719298,0.03508771929824561
But this isn't ideal, because the Identify Smiles: Code section is where DeepFace.analyze() is called, which is the most computationally intensive part and the one that would benefit the most from a GPU rather than the CPU!
Thank you for pointing this out, @quirksahern. We will work on a solution.
Notes on my debugging work:
I added a few lines to the Identify Smiles: Code block in order to pull out a specific error message when using T4 GPU. This is what all the results look like:
Analyzing: ./images/1961/1961 faces/1_page-55.png
An error happened while analyzing ./images/1961/1961 faces/1_page-55.png: Exception encountered when calling Conv2D.call().
{{function_node __wrapped__Conv2D_device_/job:localhost/replica:0/task:0/device:GPU:0}} No DNN in stream executor. [Op:Conv2D]
Arguments received by Conv2D.call():
• inputs=tf.Tensor(shape=(1, 48, 48, 1), dtype=float32)
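The exact debugging lines weren't shared in this thread, but they were presumably a try/except wrapper along these lines: a sketch (not the original code) of catching the exception around each analysis call and printing the full message and traceback instead of a bare "An error happened". The stand-in function below is purely illustrative, so the pattern can be shown without DeepFace or a GPU.

```python
import traceback

def analyze_with_details(analyze_fn, img_path):
    """Run one analysis call; on failure, print the full error and traceback
    instead of swallowing it, then return None so the loop can continue."""
    try:
        return analyze_fn(img_path)
    except Exception as exc:
        print(f"An error happened while analyzing {img_path}: {exc}")
        traceback.print_exc()
        return None

# Stand-in for DeepFace.analyze, raising the error seen on the T4 runtime:
def fake_analyze(path):
    raise RuntimeError("No DNN in stream executor. [Op:Conv2D]")

result = analyze_with_details(fake_analyze, "./images/1961/1961 faces/1_page-55.png")
```

In the notebook, `fake_analyze` would be replaced by the real `DeepFace.analyze` call from the Identify Smiles: Code block.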
ChatGPT (who is helping my non-expert self navigate this!) suggests:
What “No DNN in stream executor” Means
When TensorFlow runs a Conv2D on GPU it tries to call into NVIDIA’s cuDNN library (the “DNN”). The error No DNN in stream executor. [Op:Conv2D] means that, at runtime, TensorFlow did not find a compatible cuDNN implementation for your GPU-side convolution and therefore cannot perform the operation on the GPU.
On Colab’s T4 instance you normally have CUDA and cuDNN installed, but your TensorFlow build may have been compiled against a different cuDNN version than Colab provides. For example, many users of TF 2.17+ saw that their binary was built with cuDNN 9.3.0 but Colab was loading cuDNN 9.1.0, so every convolution on the GPU failed with this same message.
Why It’s Happening in Your Notebook
- TensorFlow–cuDNN Mismatch: Your notebook’s tensorflow package was likely installed or upgraded (perhaps indirectly by DeepFace) to a version that expects a newer cuDNN.
Colab’s pre-installed GPU drivers and cuDNN may not match that expectation, so the DNN kernels aren’t available in the GPU “stream executor.”
- enforce_detection=False
Disabling face detection (with enforce_detection=False) still passes images through the emotion network, so you still hit that same Conv2D path.
How to Fix It
- Downgrade TensorFlow to Match Colab’s cuDNN
Colab’s GPU setup tends to work perfectly with TensorFlow 2.15.0. Downgrading often restores compatibility:
!pip install tensorflow==2.15.0
Then restart the runtime so that the matching cuDNN libraries load correctly.
- Upgrade to a Nightly TensorFlow with Newer cuDNN Support
If you want to stick with a more recent TensorFlow, install the nightly build which bundles newer CUDA/cuDNN compatibility:
!pip install tf-nightly
That resolves the missing DNN kernels in TF 2.19+ environments.
Recommendation: The quickest, most reliable route is to pin TensorFlow 2.15.0 in your Colab environment, then restart. That will restore GPU-accelerated convolutions without further tinkering.
I first tried Solution 2 by adding a new cell with !pip install --upgrade tf-nightly at the very beginning. However, this resulted in the same error message as before, and I didn't have capacity to further debug this.
I then tried Solution 1 by specifying the version of tensorflow to use, like this:
%%capture
%mkdir yearbook
%cd yearbook
!pip install --upgrade --no-cache-dir gdown
!gdown --id "1NHT8NN8ClBEnUC5VqkP3wr2KhyiIQzyU"
!unzip PHfiles.zip
%mkdir images
!pip install PyMuPDF
!pip install dlib
!pip install DeepFace
!pip install tensorflow==2.15.0
import os, shutil, fitz, cv2, numpy as np, pandas as pd, dlib, tensorflow as tf
from os.path import dirname, join
from deepface import DeepFace
With this specification, the Identify Smiles: Code block worked! And the results in the CSV file were similar to the ones I got when running on my CPU:
Years,Smiles,Non-Smiles,Error Weight
1911,0.0,1.0,0.0
1921,0.0,0.9230769230769231,0.07692307692307693
1931,0.1,0.85,0.05
1941,0.0,0.8333333333333334,0.16666666666666666
1951,0.2727272727272727,0.6958041958041958,0.03146853146853147
1961,0.42105263157894735,0.543859649122807,0.03508771929824561
Hello @hawc2 (and perhaps @c-goldberg, if you are around!), what do you think? Does this seem like a sensible solution to the problem?
i.e. simply to specify the tensorflow version (2.15.0) in the first block of code.
The issue I anticipate with this is that we don't know whether this specific version will remain compatible in the long term. Ideally, the suggested Solution 2 seems better, as it would keep updating the compatibility, but I don't have the expertise to understand what more is needed here.
@charlottejmc this seems like a good solution to me, but I'm curious to hear @c-goldberg's perspective
Hi Charlotte,
Thanks for digging into this! I think I've reached the same position: I don't know enough about TensorFlow to know why solution #2 (which seems preferable to me) isn't working. I think solution #1 is a fine one for now. Hopefully Colab continues to support this version of TF.
Best, Charlie
Thank you for your feedback @c-goldberg!
Hello @quirksahern, we're working on updating the live lesson and notebook at the moment.
In the mean time, please feel free to adapt the steps as agreed by Charlie i.e. to specify the tensorflow version (2.15.0) in the first block of code.
Thank you again for reaching out to us about this! ✨
A massive thank you! I'm planning on using this with students in the next couple of weeks. The plan is to really dig into some tech ethics.
For info when I try to specify the version I then get an error with the following line, so may just run as CPU:
ValueError                                Traceback (most recent call last)
<cell line: 0>
      9 get_ipython().system('pip install DeepFace')
     10 get_ipython().system('pip install tensorflow==2.15.0')
---> 11 import os, shutil, fitz, cv2, numpy as np, pandas as pd, dlib, tensorflow as tf

6 frames
/usr/local/lib/python3.11/dist-packages/numpy/random/_pickle.py
----> 1 from .mtrand import RandomState

numpy/random/mtrand.pyx in init numpy.random.mtrand()

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
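[Editorial note: this "dtype size changed" error typically means a compiled extension module was built against a different numpy ABI than the one currently loaded. One hedged workaround worth trying (untested against this notebook; the numpy pin is an assumption, not something confirmed in this thread) is to pin numpy explicitly alongside the pinned TensorFlow, then restart the runtime before importing anything:]

```shell
# In a Colab cell (prefix with "!" when run from a notebook):
pip install "tensorflow==2.15.0" "numpy==1.26.4"
# ...then Runtime > Restart session, so the previously loaded numpy
# extension modules aren't mixed with the newly installed ones.
```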
Yes, I'm seeing that now, too. In the next couple of weeks, I'd like to find a better resolution for these errors that maintains GPU use, since the lesson discusses its importance in object detection workflows. For now, forcing Colab to use the CPU instead does allow for successful execution, so that seems like a good solution.
Adding this immediately after the import line (which is line 12) works for me: os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
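[Editorial note: for readers following along, a minimal sketch of what this line does. The TensorFlow check in the comments is a suggested verification, assuming TensorFlow hasn't already initialized its GPU context before the variable is set.]

```python
import os

# "-1" matches no CUDA device index, so CUDA-based libraries such as
# TensorFlow see no GPU at all and fall back to the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# Once TensorFlow is imported, you can confirm the GPU is hidden with:
#   import tensorflow as tf
#   tf.config.list_physical_devices("GPU")   # expected to return []
```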
Hi again @c-goldberg, could you please explain a little further how you use os.environ["CUDA_VISIBLE_DEVICES"] = "-1", and what it does? Is it in conjunction with the !pip install tensorflow==2.15.0 specification?
I'd like to put up a temporary warning message for our readers until we can find a more durable resolution.
Thank you so much for your time!
Dear Charlie @c-goldberg,
I hope you have been well, and that you are enjoying the summer!
I was hoping to hear back from you about this tensorflow issue in Facial Recognition in Historical Photographs with Artificial Intelligence in Python. Back in May, we were really close to finding a workable fix, before it suddenly stopped reliably working!
Would you perhaps have some time to dedicate over the next few weeks to collaborate on a solution, or at least a clear warning message that lets readers know what to expect?
Thank you very much in advance for your help. ✨
Hi Charlotte,
Thanks so much for the reminder. Yes! I will put this on my to do list over the next few weeks. I'll be in touch.
Best, Charlie
Many thanks, Charlie @c-goldberg. We'd be enormously grateful for your support with this.
All best, Anisa
Hi all,
Thanks so much for your patience with this. I can confirm that the temporary workaround continues to work: adding os.environ["CUDA_VISIBLE_DEVICES"] = "-1" at line 13 does allow successful execution of the program. This instructs Colab to disable use of the GPU and use the CPU instead. It does not require including !pip install tensorflow==2.15.0 to force the program to use an older version of tensorflow. For now, I think adding that line of code to the Colab file is sufficient.
I would also agree that a disclaimer alerting readers that the process is not working ideally would be helpful for now: "Due to a compatibility issue between TensorFlow and Google Colab, we've disabled GPU use to allow the program to execute successfully. In the meantime, we're working on a permanent solution that continues to use the GPU as described in the lesson."
Unfortunately, I'll need to continue to dig into the deeper problem of why the program fails to execute with the latest version of TensorFlow. I notice that Colab does not provide much debugging info, which is making it difficult to see exactly what is happening. Charlotte, your first message from May mentions adding some code in order to see a more detailed error message. Can you share with me what you added? It seems like this may be a problem that's beyond my Python abilities, but I'm hoping I can find a satisfactory solution.
Best, Charlie
Hello Charlie @c-goldberg,
Thank you for your note, and for the time you've already dedicated to investigating this. We are grateful for your commitment to the sustainability of your lesson.
First of all, just to explain that Charlotte has moved on from our team in the past month, so unfortunately she isn't here to answer your question about which lines of code were added to generate the more detailed error report she mentions. As Charlotte indicates, the debugging work she did was guided by ChatGPT, and I suspect that any additional code may have been generated by ChatGPT too.
I think the most sensible next step might be for me to add a general note at the header of the lesson which indicates that we've identified a problem and share a link back to the discussion in this issue. That way, learners can review this conversation and continue their own troubleshooting.
If the addition of the line os.environ["CUDA_VISIBLE_DEVICES"] = "-1" would make the lesson usable as it is, I think that could be worth us adding. (Does this line disable the GPU?)
Before I prepare that, may I ask if you could clarify exactly where this line should be added to the notebook as rendered in nbviewer? Am I correct in understanding that you are suggesting this should be added as the final line of the Preliminaries block?
[...]
import os, shutil, fitz, cv2, numpy as np, pandas as pd, dlib, tensorflow as tf
from os.path import dirname, join
from deepface import DeepFace
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
We can also add the alert message you have suggested. Do you agree this would be best placed at the beginning of the section titled Preparing Your Environment?
With our thanks, Anisa
Hi Anisa,
Thanks for these details. Here are some responses:
- Yes, the new line of code disables the GPU for the session. Placing it where you do at the end of Preliminaries makes the most sense.
- Also yes to placing the note at the beginning of that section. However, I notice that the lesson currently includes a recommendation that the user should ensure that GPU use is enabled; since we're disabling the GPU, we should remove that notice.
In the long run, finding a solution that uses the GPU without errors popping up is attractive, but I also wonder if the simpler solution isn't to revise the lesson to keep the GPU disabled in the long term. The total execution time of the program isn't affected by using the CPU instead. Advanced users will certainly want to use a GPU if/when they create their own object detector, but I think they would end up writing such a program from scratch, which makes our question less pressing in my mind. I don't think a revision like this would be onerous, and I would just want to make it clear to the reader that we are doing so to avoid future dependency errors with Python packages. Let me know your thoughts.
Best, Charlie
Many thanks, Charlie @c-goldberg. I really appreciate the thought you've given this.
I'm preparing a PR to make the following short-term adjustments: https://github.com/programminghistorian/jekyll/pull/3643
- Add os.environ["CUDA_VISIBLE_DEVICES"] = "-1" as the final line of the Preliminaries block
- Add a new note explaining the compatibility problem between TensorFlow and Google Colab, and the decision to disable the GPU
- Remove the existing note advising readers to ensure connection to a GPU runtime
You can review the details in rich-diff and let me know if you feel anything needs adjustment.
--
We'd be enormously grateful for your help to adapt this lesson to improve its sustainability. If you have capacity to offer us your expertise (and you think the revisions wouldn't be too onerous) then I'd be delighted to collaborate with you to implement any necessary updates. Depending on whether the revisions required are minor or substantial, we can also initiate a conversation with the Managing Editor Alex @hawc2 about whether the new version of the lesson should be assigned a fresh DOI.