
python-3.{10,11}: enable FIPS compatible multiprocessing

Open javacruft opened this issue 1 month ago • 13 comments

Pick a patch from DataDog's Python 3.11 + 3.10 branches to default multiprocessing to use SHA256, with fallback to MD5 if needed/available.

This is based on the changes made for 3.12+ and authored by the same Python core developer.
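As a rough illustration of the behaviour described above (this is not the DataDog patch itself), the sketch below signs a challenge with SHA256 by default and uses MD5 only for a legacy peer, and only if the local OpenSSL configuration still allows it. The names `digest_usable`, `sign_challenge` and `legacy_peer` are made up for this example.

```python
import hmac
import os

PREFERRED = "sha256"  # default digest after the change
LEGACY = "md5"        # only for peers that still speak the old protocol


def digest_usable(name: str) -> bool:
    """Return True if HMAC with this digest works under the current OpenSSL config."""
    try:
        hmac.new(b"probe", b"probe", name).digest()
        return True
    except ValueError:
        # FIPS-configured OpenSSL typically rejects MD5 at this point.
        return False


def sign_challenge(authkey: bytes, message: bytes,
                   legacy_peer: bool = False) -> tuple[str, bytes]:
    """Pick a digest and return (digest name, HMAC of the message)."""
    name = LEGACY if legacy_peer else PREFERRED
    if not digest_usable(name):
        raise RuntimeError(f"digest {name!r} is not available on this system")
    return name, hmac.new(authkey, message, name).digest()


if __name__ == "__main__":
    key, challenge = os.urandom(32), os.urandom(40)
    name, mac = sign_challenge(key, challenge)
    print(name, mac.hex())
```

Under a FIPS configuration, `sign_challenge(..., legacy_peer=True)` would be expected to raise in this sketch, while the default path keeps working.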

javacruft, Nov 26 '25 17:11

Looks good to me; I appreciate that it's smaller. I wish we would number our patches, though. I'd like to start making development work for some of this easier, along the lines of: https://github.com/chainguard-dev/melange/pull/2170

justinvreeland, Nov 26 '25 17:11

Also tested with Chainguard's FIPS openssl configurations; test fails with the current versions of both packages and passes with the updates in this PR.
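The test.py script itself is not included in this thread. The sketch below is a guess at a minimal script that exercises the same code path: the `manager.list()` call appears in the tracebacks, while the `worker` and `run` names and the overall structure are assumptions.

```python
import multiprocessing


def worker(shared_list, index):
    """Announce the process and append its index to the shared list."""
    print(f"Process {index}: Starting")
    shared_list.append(index)


def run(count: int = 5) -> list:
    """Start `count` processes that all append to one manager-backed list."""
    with multiprocessing.Manager() as manager:
        # Creating the proxy opens an authenticated connection to the manager,
        # which is where the HMAC challenge/response happens.
        shared_list = manager.list()
        procs = [
            multiprocessing.Process(target=worker, args=(shared_list, i))
            for i in range(count)
        ]
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()
        # Copy out of the proxy before the manager shuts down.
        return list(shared_list)


if __name__ == "__main__":
    result = run()
    print("\nAll processes finished.")
    print(f"Final shared list: {result}")
    print(f"List length: {len(result)}")
```

The order of items in the final list depends on process scheduling, so it can differ between runs.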

Python 3.10

6a83619e8b60:/tmp# apk add python-3.10
(1/15) Installing py3-pip-wheel (25.3-r2)
(2/15) Installing py3-setuptools-wheel (80.9.0-r3)
(3/15) Installing libbz2-1 (1.0.8-r21)
(4/15) Installing libexpat1 (2.7.3-r0)
(5/15) Installing libffi (3.5.2-r1)
(6/15) Installing gdbm (1.26-r1)
(7/15) Installing xz (5.8.1-r6)
(8/15) Installing libstdc++ (15.2.0-r6)
(9/15) Installing mpdecimal (4.0.1-r3)
(10/15) Installing ncurses-terminfo-base (6.5_p20251025-r1)
(11/15) Installing ncurses (6.5_p20251025-r1)
(12/15) Installing readline (8.3-r1)
(13/15) Installing sqlite-libs (3.51.0-r0)
(14/15) Installing python-3.10-base (3.10.19-r3)
(15/15) Installing python-3.10 (3.10.19-r3)
Executing busybox-1.37.0-r50.trigger
OK: 61 MiB in 34 packages
6a83619e8b60:/tmp# python3 test.py 
Process 0: Starting
Process 1: Starting
Process 2: Starting
Process 3: Starting
Process 4: Starting

All processes finished.
Final shared list: [0, 1, 2, 3, 4]
List length: 5
6a83619e8b60:/tmp# apk add python-3.10=3.10.19-r2
(1/2) Downgrading python-3.10-base (3.10.19-r3 -> 3.10.19-r2)
(2/2) Downgrading python-3.10 (3.10.19-r3 -> 3.10.19-r2)
Executing busybox-1.37.0-r50.trigger
OK: 61 MiB in 34 packages
6a83619e8b60:/tmp# python3 test.py 
Traceback (most recent call last):
  File "/tmp/test.py", line 10, in <module>
    shared_list = manager.list()
  File "/usr/lib/python3.10/multiprocessing/managers.py", line 723, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/usr/lib/python3.10/multiprocessing/managers.py", line 606, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 508, in Client
    answer_challenge(c, authkey)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 755, in answer_challenge
    digest = hmac.new(authkey, message, 'md5').digest()
  File "/usr/lib/python3.10/hmac.py", line 184, in new
    return HMAC(key, msg, digestmod)
  File "/usr/lib/python3.10/hmac.py", line 60, in __init__
    self._init_hmac(key, msg, digestmod)
  File "/usr/lib/python3.10/hmac.py", line 67, in _init_hmac
    self._hmac = _hashopenssl.hmac_new(key, msg, digestmod=digestmod)
ValueError: [digital envelope routines] unsupported

Python 3.11

6a83619e8b60:/tmp# apk add python-3.11=3.11.14-r2
(1/2) Downgrading python-3.11-base (3.11.14-r3 -> 3.11.14-r2)
(2/2) Downgrading python-3.11 (3.11.14-r3 -> 3.11.14-r2)
Executing busybox-1.37.0-r50.trigger
OK: 69 MiB in 34 packages
6a83619e8b60:/tmp# python3 test.py 
Traceback (most recent call last):
  File "/usr/lib/python3.11/hmac.py", line 60, in __init__
    self._init_hmac(key, msg, digestmod)
  File "/usr/lib/python3.11/hmac.py", line 67, in _init_hmac
    self._hmac = _hashopenssl.hmac_new(key, msg, digestmod=digestmod)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_hashlib.UnsupportedDigestmodError: [digital envelope routines] unsupported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/hashlib.py", line 177, in __hash_new
    return _hashlib.new(name, data, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_hashlib.UnsupportedDigestmodError: [digital envelope routines] unsupported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/test.py", line 10, in <module>
    shared_list = manager.list()
                  ^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/managers.py", line 727, in temp
    token, exp = self._create(typeid, *args, **kwds)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/managers.py", line 607, in _create
    conn = self._Client(self._address, authkey=self._authkey)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 525, in Client
    answer_challenge(c, authkey)
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 772, in answer_challenge
    digest = hmac.new(authkey, message, 'md5').digest()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/hmac.py", line 184, in new
    return HMAC(key, msg, digestmod)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/hmac.py", line 62, in __init__
    self._init_old(key, msg, digestmod)
  File "/usr/lib/python3.11/hmac.py", line 80, in _init_old
    self._outer = digest_cons()
                  ^^^^^^^^^^^^^
  File "/usr/lib/python3.11/hmac.py", line 75, in <lambda>
    digest_cons = lambda d=b'': _hashlib.new(digestmod, d)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/hashlib.py", line 182, in __hash_new
    return __get_builtin_constructor(name)(data, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/hashlib.py", line 140, in __get_builtin_constructor
    raise ValueError('unsupported hash type ' + name)
ValueError: unsupported hash type md5
6a83619e8b60:/tmp# apk add python-3.11=3.11.14-r3
(1/2) Upgrading python-3.11-base (3.11.14-r2 -> 3.11.14-r3)
(2/2) Upgrading python-3.11 (3.11.14-r2 -> 3.11.14-r3)
Executing busybox-1.37.0-r50.trigger
OK: 69 MiB in 34 packages
6a83619e8b60:/tmp# python3 test.py 
Process 0: Starting
Process 1: Starting
Process 2: Starting
Process 4: Starting
Process 3: Starting

All processes finished.
Final shared list: [0, 1, 2, 4, 3]
List length: 5

javacruft, Nov 27 '25 10:11

I know I might be a bit of a stick-in-the-mud here, but should we run some reverse-dependency tests across our repositories? It shouldn't regress in theory, but since it is a change to a core component I'd recommend playing it safe.

sil2100, Nov 27 '25 10:11

> I know I might be a bit of a stick-in-the-mud here, but should we run some reverse-dependency tests across our repositories? It shouldn't regress in theory, but since it is a change to a core component I'd recommend playing it safe.

Great idea - and I need to record an apkregress video as well :)

javacruft, Nov 27 '25 11:11

Nice! I was just going to copy the 3.12 multiprocessing.py over and go from there. 😅

How did you track down the DataDog patch? Is this submitted upstream in case there's a 3.11/3.10 update? Are these questions already answered somewhere? ;-)

Taffer, Nov 27 '25 14:11

> Nice! I was just going to copy the 3.12 multiprocessing.py over and go from there. 😅
>
> How did you track down the DataDog patch? Is this submitted upstream in case there's a 3.11/3.10 update? Are these questions already answered somewhere? ;-)

The DataDog commits are referenced in the GitHub issue that implemented this feature in 3.12.

javacruft, Nov 27 '25 14:11

Python 3.11 RDT in Wolfi:

Summary

Total packages found: 704
Packages skipped (no YAML): 21
Packages tested: 683
Regressions detected: 9
Hung tests: 0
Successful packages: 665
Failed packages: 9

Packages with regressions:

  • py3-cachetools
  • open-webui
  • py3-contourpy
  • conda
  • py3-dask
  • py3-cppy
  • kubeflow-katib
  • py3-debugpy
  • py3-crashtest

Error: found 9 regressions

Re-ran the regressions; all were auth errors:

Test Results

✅ py3-cppy: PASS (with repo, without-repo test skipped)
✅ kubeflow-katib: PASS (with repo, without-repo test skipped)
✅ py3-cachetools: PASS (with repo, without-repo test skipped)
✅ py3-dask: PASS (with repo, without-repo test skipped)
✅ open-webui: PASS (with repo, without-repo test skipped)
✅ py3-contourpy: PASS (with repo, without-repo test skipped)
✅ py3-debugpy: PASS (with repo, without-repo test skipped)
✅ py3-crashtest: PASS (with repo, without-repo test skipped)
✅ conda: PASS (with repo, without-repo test skipped)

javacruft, Nov 27 '25 15:11

So they're good now, or they're now producing auth errors? I'm a bit slow this morning! 😅

I'd expect the new code to only fail if an older MD5 connection was being tested in FIPS mode, otherwise it should just work either way.
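For context, the 3.12-style handshake tags the challenge with the digest name, so a peer can tell which HMAC to compute, and treats an untagged challenge as the legacy MD5 protocol. Below is a simplified sketch of that selection step, not the actual CPython code; `challenge_digest`, the allowed set and the length constants are assumptions for illustration.

```python
ALLOWED_DIGESTS = frozenset({"md5", "sha256"})  # illustrative subset
LEGACY_LENGTHS = (16, 20)  # assumed sizes of untagged legacy messages
MAX_NAME_LEN = 20          # assumed upper bound on a digest name


def challenge_digest(message: bytes) -> str:
    """Return the digest name a challenge asks for.

    Tagged form:   b"{sha256}<random bytes>" -> "sha256"
    Untagged form: short legacy message      -> "md5"
    """
    if len(message) in LEGACY_LENGTHS:
        # Old peers send a bare random challenge and expect HMAC-MD5.
        return "md5"
    if message.startswith(b"{"):
        end = message.find(b"}", 1, MAX_NAME_LEN + 2)
        if end > 0:
            name = message[1:end].decode("ascii", errors="replace")
            if name in ALLOWED_DIGESTS:
                return name
    raise ValueError("unrecognised or unsupported challenge format")
```

In this sketch only the `"md5"` result would run into trouble under FIPS, which matches the expectation that the new code fails only for an older MD5 connection.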

Taffer, Nov 27 '25 15:11

APK Regression Test Summary

Package: python-3.10
APK Repository: https://apk.cgr.dev/wolfi-presubmit/cbca8e71facf05fa043aaae174b2a1ae7b1e2fde
Test Duration: 1h5m46s

Test Results

Total packages found: 680
Packages skipped (no YAML): 14
Packages tested: 666
Regressions detected: 7
Hung tests: 0
Successful packages: 650
Failed packages: 9

🔴 Packages with Regressions

The following packages fail with the new APK repository but pass without it, indicating potential regressions:

  • py3-dulwich
  • py3-elfdeps
  • py3-faiss-cpu
  • py3-fastavro
  • py3-execnet
  • py3-distributed
  • py3-docutils

Retested the regressions with single concurrency:

APK Regression Test Summary

Package: 7 packages from file
APK Repository: https://apk.cgr.dev/wolfi-presubmit/cbca8e71facf05fa043aaae174b2a1ae7b1e2fde
Test Duration: 9m37s

Test Results

Total packages found: 7
Packages skipped (no YAML): 0
Packages tested: 7
Regressions detected: 0
Hung tests: 0
Successful packages: 7
Failed packages: 0

✅ All Tests Passed

No regressions were detected. All packages either passed with the new repository or failed consistently in both scenarios.
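The rule the report applies can be written down in a few lines. This is only a sketch of the classification as the report words it, not apkregress's actual implementation; the `Result` type and the package names in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    package: str
    passed_with_repo: bool     # tested against the new APK repository
    passed_without_repo: bool  # tested against the existing packages only


def is_regression(result: Result) -> bool:
    """A regression passes without the new repository but fails with it."""
    return result.passed_without_repo and not result.passed_with_repo


def regressions(results: list[Result]) -> list[str]:
    """Return the sorted names of packages classified as regressions."""
    return sorted(r.package for r in results if is_regression(r))


if __name__ == "__main__":
    sample = [
        Result("pkg-a", passed_with_repo=True, passed_without_repo=True),
        Result("pkg-b", passed_with_repo=False, passed_without_repo=True),
        Result("pkg-c", passed_with_repo=False, passed_without_repo=False),
    ]
    print(regressions(sample))
```

A package that fails both ways (`pkg-c` here) counts as a consistent failure rather than a regression, which is why the failed and regression counts in the summaries differ.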

javacruft, Nov 27 '25 17:11

> So they're good now, or they're now producing auth errors? I'm a bit slow this morning! 😅
>
> I'd expect the new code to only fail if an older MD5 connection was being tested in FIPS mode, otherwise it should just work either way.

That was a bit confusing: the auth failures were 401s from apk.cgr.dev in my test environment, so we're all good now!

javacruft, Nov 27 '25 17:11

I'd like to do some testing in other repos first, so this change will probably land early next week.

javacruft, Nov 27 '25 17:11

APK Regression Test Summary - enterprise-packages

Package: python-3.10
APK Repository: https://apk.cgr.dev/wolfi-presubmit/cbca8e71facf05fa043aaae174b2a1ae7b1e2fde
Test Duration: 40m11s

Test Results

Total packages found: 225
Packages skipped (no YAML): 12
Packages tested: 213
Regressions detected: 7
Hung tests: 1
Successful packages: 200
Failed packages: 5

🔴 Packages with Regressions

The following packages fail with the new APK repository but pass without it, indicating potential regressions:

  • py3-lark
  • py3-graphviz
  • py3-nvml
  • py3-segments
  • py3-dataclass-wizard
  • py3-csvw
  • py3-humanize

⏰ Tests That Hung

The following tests were killed after 30m0s timeout:

  • py3-peft-cuda-12.9 (without repo)

Reran the regressions - all passed:

APK Regression Test Summary - enterprise-packages

Package: 7 packages from file
APK Repository: https://apk.cgr.dev/wolfi-presubmit/cbca8e71facf05fa043aaae174b2a1ae7b1e2fde
Test Duration: 1m54s

Test Results

Total packages found: 7
Packages skipped (no YAML): 0
Packages tested: 7
Regressions detected: 0
Hung tests: 0
Successful packages: 7
Failed packages: 0

✅ All Tests Passed

javacruft, Nov 28 '25 11:11

APK Regression Test Summary - enterprise-packages

Package: python-3.11
APK Repository: https://apk.cgr.dev/wolfi-presubmit/cbca8e71facf05fa043aaae174b2a1ae7b1e2fde
Test Duration: 46m28s

Test Results

Total packages found: 246
Packages skipped (no YAML): 9
Packages tested: 237
Regressions detected: 5
Hung tests: 3
Successful packages: 221
Failed packages: 8

🔴 Packages with Regressions

The following packages fail with the new APK repository but pass without it, indicating potential regressions:

  • py3-polib
  • py3-weasyprint
  • py3-dunamai
  • vunnel
  • azure-functions-host

⏰ Tests That Hung

The following tests were killed after 30m0s timeout:

  • py3-accelerate-cuda-12.9 (with repo)
  • py3-sagemaker-huggingface-inference-toolkit (with repo)
  • py3-torchprofile-cuda-12.9 (with repo)

Re-ran the five failures:

APK Regression Test Summary

Package: 5 packages from file
APK Repository: https://apk.cgr.dev/wolfi-presubmit/cbca8e71facf05fa043aaae174b2a1ae7b1e2fde
Test Duration: 5m10s

Test Results

Total packages found: 5
Packages skipped (no YAML): 0
Packages tested: 5
Regressions detected: 0
Hung tests: 0
Successful packages: 5
Failed packages: 0

✅ All Tests Passed

No regressions were detected. All packages either passed with the new repository or failed consistently in both scenarios.

javacruft, Nov 28 '25 12:11