cryptography icon indicating copy to clipboard operation
cryptography copied to clipboard

Support free-threaded CPython (3.13t)

Open ngoldbaum opened this issue 9 months ago • 10 comments

We have a WIP version of cffi that builds on the free-threaded interpreter: https://github.com/Quansight-Labs/cffi/pull/1

I think with that it should now be possible to make some headway on free-threaded support without too much hacking to disable things.

Unfortunately we're still blocked on releasing cp313t wheels to users until CFFI support is merged, unless there's any appetite to depend on our fork. There isn't a way to depend on the fork only on the free-threaded build because environment markers don't know about ABI tags. We're hoping to have a PEP and a fix for that in Python 3.14 but that doesn't really help much right now or in the next few years.

ngoldbaum avatar Feb 19 '25 22:02 ngoldbaum

At the current juncture, we don't have an appetite to depend on a cffi fork.

We're more interested in burning down our usage of cffi (which at this point exists only for pyopenssl).

alex avatar Feb 19 '25 22:02 alex

It looks like the existing nox test sessions all pass on the free-threaded build, at least locally on my mac, after applying the following patch:

diff --git a/pyproject.toml b/pyproject.toml
index 02d58b00a..ed0ba51d9 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -5,7 +5,7 @@ requires = [
     "maturin>=1,<2",

     # Must be kept in sync with `project.dependencies`
-    "cffi>=1.12; platform_python_implementation != 'PyPy'",
+    "cffi @ git+https://github.com/quansight-labs/cffi",
     # Used by cffi (which import distutils, and in Python 3.12, distutils has
     # been removed from the stdlib, but installing setuptools puts it back) as
     # well as our build.rs for the rust/cffi bridge.
diff --git a/src/rust/cryptography-cffi/build.rs b/src/rust/cryptography-cffi/build.rs
index 1243a8187..2ddb455f4 100644
--- a/src/rust/cryptography-cffi/build.rs
+++ b/src/rust/cryptography-cffi/build.rs
@@ -78,8 +78,15 @@ fn main() {
         build.include(python_include);
     }

-    // Enable abi3 mode if we're not using PyPy.
-    if python_impl != "PyPy" {
+    let is_free_threaded = run_python_script(
+        &python,
+        "import sysconfig; print(bool(sysconfig.get_config_var('Py_GIL_DISABLED')), end='')",
+    )
+    .unwrap()
+        == "True";
+
+    // Enable abi3 mode if we're not using PyPy or the free-threaded build
+    if !(python_impl == "PyPy" || is_free_threaded) {
         // cp37 (Python 3.7 to help our grep when we some day drop 3.7 support)
         build.define("Py_LIMITED_API", "0x030700f0");
     }

That said, I don't see any explicitly multithreaded tests in the existing test suite. The tests run using pytest-xdist, but that's processed-based parallelism. It looks like there's some code to control multithreaded parallelism inside OpenSSL, but I don't see any other uses of std::thread in Rust code.

There are some unsafe impl Send and Sync uses:

src/rust/cryptography-openssl/src/hmac.rs
21:unsafe impl Sync for Hmac {}
23:unsafe impl Send for Hmac {}

src/rust/cryptography-openssl/src/aead.rs
25:unsafe impl Sync for AeadCtx {}
27:unsafe impl Send for AeadCtx {}

src/rust/cryptography-openssl/src/cmac.rs
22:unsafe impl Sync for Cmac {}
24:unsafe impl Send for Cmac {}

These are all wrappers for C types in OpenSSL. I don't know offhand if the GIL was somehow providing the guarantees backing these implementations.

There is some use of static, but they're all either const or one-time initialized GILOnceCell instances or types backed by GILOnceCell. I don't see any mutable statics.

I see here that I should expect immutable objects to be thread-safe by construction but that currently the only guarantees for objects with state are that they should not be concurrently used between threads.

Rather than adding locking, which might introduce scaling problems, one thing to consider is adding an atomic flag that a thread sets when it is actively using an object with mutable state. If another thread tries to use the object and sees the flag set, then you could generate a runtime error indicating the user tried to something unsupported. It might also just be necessary to document more fully that state should not be shared between threads, e.g. #12488.

It's not clear to me to what extent it's possible to trigger undefined behavior in the GIL-enabled build right now with the threading module if you do share stateful objects between threads.

I'm curious if the maintainers have an opinion about what they would like to see here in terms of testing and adding runtime guardrails or locking to improve thread safety.

ngoldbaum avatar Feb 24 '25 18:02 ngoldbaum

I don't believe any locks are required. All of the places where we have mutable data (e.g., Hash, CipherContext) only mutate with &mut references, which pyo3 ensures are exclusive.

There's no coherent behavior for two threads to update any of these objects concurrently without their own synchronization, so its fine IMO that those simply raise.

On Mon, Feb 24, 2025 at 1:16 PM Nathan Goldbaum @.***> wrote:

It looks like the existing nox test sessions all pass on the free-threaded build, at least locally on my mac, after applying the following patch:

diff --git a/pyproject.toml b/pyproject.toml index 02d58b00a..ed0ba51d9 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -5,7 +5,7 @@ requires = [ "maturin>=1,<2",

 # Must be kept in sync with `project.dependencies`
  • "cffi>=1.12; platform_python_implementation != 'PyPy'",
  • "cffi @ git+https://github.com/quansight-labs/cffi",

    Used by cffi (which import distutils, and in Python 3.12, distutils has

    been removed from the stdlib, but installing setuptools puts it back) as

    well as our build.rs for the rust/cffi bridge.

diff --git a/src/rust/cryptography-cffi/build.rs b/src/rust/cryptography-cffi/build.rs index 1243a8187..2ddb455f4 100644 --- a/src/rust/cryptography-cffi/build.rs +++ b/src/rust/cryptography-cffi/build.rs @@ -78,8 +78,15 @@ fn main() { build.include(python_include); }

  • // Enable abi3 mode if we're not using PyPy.
  • if python_impl != "PyPy" {
  • let is_free_threaded = run_python_script(
  •    &python,
    
  •    "import sysconfig; print(bool(sysconfig.get_config_var('Py_GIL_DISABLED')), end='')",
    
  • )
  • .unwrap()
  •    == "True";
    
  • // Enable abi3 mode if we're not using PyPy or the free-threaded build
  • if !(python_impl == "PyPy" || is_free_threaded) { // cp37 (Python 3.7 to help our grep when we some day drop 3.7 support) build.define("Py_LIMITED_API", "0x030700f0"); }

That said, I don't see any explicitly multithreaded tests in the existing test suite. The tests run using pytest-xdist, but that's processed-based parallelism. It looks like there's some code to control multithreaded parallelism inside OpenSSL, but I don't see any other uses of std::thread in Rust code.

There are some unsafe impl Send and Sync uses:

src/rust/cryptography-openssl/src/hmac.rs 21:unsafe impl Sync for Hmac {} 23:unsafe impl Send for Hmac {}

src/rust/cryptography-openssl/src/aead.rs 25:unsafe impl Sync for AeadCtx {} 27:unsafe impl Send for AeadCtx {}

src/rust/cryptography-openssl/src/cmac.rs 22:unsafe impl Sync for Cmac {} 24:unsafe impl Send for Cmac {}

These are all wrappers for C types in OpenSSL. I don't know offhand if the GIL was somehow providing the guarantees backing these implementations.

There is some use of static, but they're all either const or one-time initialized GILOnceCell instances or types backed by GILOnceCell. I don't see any mutable statics.

I see here https://github.com/pyca/cryptography/issues/12488#issuecomment-2669714990 that I should expect immutable objects to be thread-safe by construction but that currently the only guarantees for objects with state are that they should not be concurrently used between threads.

Rather than adding locking, which might introduce scaling problems, one thing to consider is adding an atomic flag that a thread sets when it is actively using an object with mutable state. If another thread tries to use the object and sees the flag set, then you could generate a runtime error indicating the user tried to something unsupported. It might also just be necessary to document more fully that state should not be shared between threads, e.g. #12488 https://github.com/pyca/cryptography/issues/12488.

It's not clear to me to what extent it's possible to trigger undefined behavior in the GIL-enabled build right now with the threading module if you do share stateful objects between threads.

I'm curious if the maintainers have an opinion about what they would like to see here in terms of testing and adding runtime guardrails or locking to improve thread safety.

— Reply to this email directly, view it on GitHub https://github.com/pyca/cryptography/issues/12489#issuecomment-2679288182, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAAGBGE36EVNONJMUTEFKL2RNOYLAVCNFSM6AAAAABXPKF7YOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMNZZGI4DQMJYGI . You are receiving this because you commented.Message ID: @.***> [image: ngoldbaum]ngoldbaum left a comment (pyca/cryptography#12489) https://github.com/pyca/cryptography/issues/12489#issuecomment-2679288182

It looks like the existing nox test sessions all pass on the free-threaded build, at least locally on my mac, after applying the following patch:

diff --git a/pyproject.toml b/pyproject.toml index 02d58b00a..ed0ba51d9 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -5,7 +5,7 @@ requires = [ "maturin>=1,<2",

 # Must be kept in sync with `project.dependencies`
  • "cffi>=1.12; platform_python_implementation != 'PyPy'",
  • "cffi @ git+https://github.com/quansight-labs/cffi",

    Used by cffi (which import distutils, and in Python 3.12, distutils has

    been removed from the stdlib, but installing setuptools puts it back) as

    well as our build.rs for the rust/cffi bridge.

diff --git a/src/rust/cryptography-cffi/build.rs b/src/rust/cryptography-cffi/build.rs index 1243a8187..2ddb455f4 100644 --- a/src/rust/cryptography-cffi/build.rs +++ b/src/rust/cryptography-cffi/build.rs @@ -78,8 +78,15 @@ fn main() { build.include(python_include); }

  • // Enable abi3 mode if we're not using PyPy.
  • if python_impl != "PyPy" {
  • let is_free_threaded = run_python_script(
  •    &python,
    
  •    "import sysconfig; print(bool(sysconfig.get_config_var('Py_GIL_DISABLED')), end='')",
    
  • )
  • .unwrap()
  •    == "True";
    
  • // Enable abi3 mode if we're not using PyPy or the free-threaded build
  • if !(python_impl == "PyPy" || is_free_threaded) { // cp37 (Python 3.7 to help our grep when we some day drop 3.7 support) build.define("Py_LIMITED_API", "0x030700f0"); }

That said, I don't see any explicitly multithreaded tests in the existing test suite. The tests run using pytest-xdist, but that's processed-based parallelism. It looks like there's some code to control multithreaded parallelism inside OpenSSL, but I don't see any other uses of std::thread in Rust code.

There are some unsafe impl Send and Sync uses:

src/rust/cryptography-openssl/src/hmac.rs 21:unsafe impl Sync for Hmac {} 23:unsafe impl Send for Hmac {}

src/rust/cryptography-openssl/src/aead.rs 25:unsafe impl Sync for AeadCtx {} 27:unsafe impl Send for AeadCtx {}

src/rust/cryptography-openssl/src/cmac.rs 22:unsafe impl Sync for Cmac {} 24:unsafe impl Send for Cmac {}

These are all wrappers for C types in OpenSSL. I don't know offhand if the GIL was somehow providing the guarantees backing these implementations.

There is some use of static, but they're all either const or one-time initialized GILOnceCell instances or types backed by GILOnceCell. I don't see any mutable statics.

I see here https://github.com/pyca/cryptography/issues/12488#issuecomment-2669714990 that I should expect immutable objects to be thread-safe by construction but that currently the only guarantees for objects with state are that they should not be concurrently used between threads.

Rather than adding locking, which might introduce scaling problems, one thing to consider is adding an atomic flag that a thread sets when it is actively using an object with mutable state. If another thread tries to use the object and sees the flag set, then you could generate a runtime error indicating the user tried to something unsupported. It might also just be necessary to document more fully that state should not be shared between threads, e.g. #12488 https://github.com/pyca/cryptography/issues/12488.

It's not clear to me to what extent it's possible to trigger undefined behavior in the GIL-enabled build right now with the threading module if you do share stateful objects between threads.

I'm curious if the maintainers have an opinion about what they would like to see here in terms of testing and adding runtime guardrails or locking to improve thread safety.

— Reply to this email directly, view it on GitHub https://github.com/pyca/cryptography/issues/12489#issuecomment-2679288182, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAAGBGE36EVNONJMUTEFKL2RNOYLAVCNFSM6AAAAABXPKF7YOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMNZZGI4DQMJYGI . You are receiving this because you commented.Message ID: @.***>

-- All that is necessary for evil to succeed is for good people to do nothing.

alex avatar Feb 24 '25 18:02 alex

so its fine IMO that those simply raise

Cool, I will take a look to verify this happens right now via PyO3's runtime borrow checking by adding some explicitly multithreaded tests of stateful primitives.

It looks like the python APIs that provide stateful objects with an update method are CipherContext, HashContext, and PaddingContext, so any concurrent shared use of those with python's threading module should be an error. Are there any other stateful primitives exposed to Python I'm missing?

For the rest of the API that we expect to be thread-safe by construction, how far do you want me to go adding tests to verify that? Are there any high-level integration tests I could leverage to modify into a multithreaded test case along the lines of what was added to bcrypt when we added free-threaded support?

Thanks so much for your help and advice 😀.

ngoldbaum avatar Feb 24 '25 18:02 ngoldbaum

I don't feel any particular need to have tests for all the immutable objects.

On Mon, Feb 24, 2025 at 1:52 PM Nathan Goldbaum @.***> wrote:

so its fine IMO that those simply raise

Cool, I will take a look to verify this happens right now via PyO3's runtime borrow checking by adding some explicitly multithreaded tests of stateful primitives.

It looks like the python APIs that provide stateful objects with an update method are CipherContext, HashContext, and PaddingContext, so any concurrent shared use of those with python's threading module should be an error. Are there any other stateful primitives exposed to Python I'm missing?

For the rest of the API that we expect to be thread-safe by construction, how far do you want me to go adding tests to verify that? Are there any high-level integration tests I could leverage to modify into a multithreaded test case along the lines of what was added to bcrypt when we added free-threaded support?

Thanks so much for your help and advice 😀.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>

ngoldbaum left a comment (pyca/cryptography#12489)

so its fine IMO that those simply raise

Cool, I will take a look to verify this happens right now via PyO3's runtime borrow checking by adding some explicitly multithreaded tests of stateful primitives.

It looks like the python APIs that provide stateful objects with an update method are CipherContext, HashContext, and PaddingContext, so any concurrent shared use of those with python's threading module should be an error. Are there any other stateful primitives exposed to Python I'm missing?

For the rest of the API that we expect to be thread-safe by construction, how far do you want me to go adding tests to verify that? Are there any high-level integration tests I could leverage to modify into a multithreaded test case along the lines of what was added to bcrypt when we added free-threaded support?

Thanks so much for your help and advice 😀.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>

-- All that is necessary for evil to succeed is for good people to do nothing.

alex avatar Feb 24 '25 18:02 alex

Would you be OK with setting up a CI job that installs a fork of CFFI with @colesbury's PR applied to enable running tests on the free-threaded build? I think we can leave the top-level pyproject.toml alone and override the cffi dependency for that one CI job. Without depending on the fork there's a build error that I don't see how to work around.

That will unblock running free-threaded Python in CI and allow me to add and test some code paths that will only trigger on the free-threaded build. The free-threaded-specific code paths are all to handle runtime borrow-checker errors, along with some asserts that the exceptions only happen on the free-threaded build.

So far I've written tests for hashing and padding and I still need to look at unpadding and ciphers. It looks like I'll also need to move some code from Python to Rust before I can finish off padding and there will likely be more code that needs to be ported to Rust elsewhere.

If a CI job is OK, would you be OK with me setting that up right now, along with the hash digest tests I've already written?

Thanks for your help!

ngoldbaum avatar Mar 04 '25 18:03 ngoldbaum

I’d be fine with that in the context of the draft PR so you can make forward progress.

reaperhulk avatar Mar 04 '25 19:03 reaperhulk

I opened a draft PR with what I have so far: #12555.

ngoldbaum avatar Mar 04 '25 23:03 ngoldbaum

Linking this comment on an older PR, which explains what needs to happen in pyopenssl to drop the cffi dependency in cryptography: https://github.com/pyca/cryptography/pull/11954#issuecomment-2480196329

I also looked inside cryptography and the cffi bindings are used mostly in tests:

goldbaum at Nathans-MBP in ~/Documents/cryptography on pad-unpad-thread-safety
± rg "\._lib\."
tests/hazmat/backends/test_openssl.py
87:            backend._lib.ERR_put_error(
88:                backend._lib.ERR_LIB_EVP, 0, 0, b"test_openssl.py", -1
91:        assert backend._lib.ERR_peek_error() != 0
95:        assert backend._lib.ERR_peek_error() == 0
99:        meth = backend._lib.TLS_method()
100:        ctx = backend._lib.SSL_CTX_new(meth)
102:        backend._lib.SSL_CTX_free(ctx)
105:        cipher = backend._lib.EVP_get_cipherbyname(b"aes-256-cbc")
236:    backend._lib.Cryptography_HAS_EVP_PKEY_DHX == 1,

tests/hazmat/primitives/test_rsa.py
1744:            and not backend._lib.Cryptography_HAS_IMPLICIT_RSA_REJECTION

tests/wycheproof/test_rsa.py
272:        if backend._lib.Cryptography_HAS_IMPLICIT_RSA_REJECTION:

src/cryptography/hazmat/backends/openssl/backend.py
242:        return self._lib.Cryptography_HAS_EVP_PKEY_DHX == 1

goldbaum at Nathans-MBP in ~/Documents/cryptography
± rg "\._ffi\."
tests/hazmat/backends/test_openssl.py
101:        assert ctx != backend._ffi.NULL
106:        assert cipher != backend._ffi.NULL

ngoldbaum avatar Mar 26 '25 19:03 ngoldbaum

I looked at multithreaded stress tests of CipherContext today and managed to come up with this: https://gist.github.com/ngoldbaum/73c75fc4615531578dacd833fa21c529

I ended up using ECB mode, so the order of inputs and outputs don't matter. Maybe there's a clever way to test the other modes without adding locking in the test using e.g. update_into but I couldn't think of it.

In any case, like the other multithreaded stress tests, this is mostly a test of pyo3's internal bookeeping being sound. I just wanted to go through the exercise to see if anything unexpected shook out.

ngoldbaum avatar Mar 27 '25 20:03 ngoldbaum

It looks like the CFFI 2.0 beta release should unblock getting a cryptography cp314t wheel shipped: https://github.com/pyca/cryptography/pull/12555#issuecomment-3133404901.

Unfortunately there are fundamental issues in CFFI on cp313t due to differences in how critical sections work, so cryptography probably will never officially support 3.13t.

ngoldbaum avatar Jul 29 '25 17:07 ngoldbaum

This can probably be unpinned now?

ngoldbaum avatar Sep 17 '25 19:09 ngoldbaum

Good call! Thanks

alex avatar Sep 17 '25 19:09 alex