idna
idna copied to clipboard
The library is not thread-safe
self_sign_pem_chain = await get_event_loop().run_in_executor(
File "/usr/lib64/python3.9/concurrent/futures/thread.py", line 52, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python3.9/site-packages/ideco_crypto/cert.py", line 261, in get_selfgenerated_chain
common_name = _normalize_domain(common_name, allow_underscore=allow_underscore)
File "/usr/lib/python3.9/site-packages/ideco_crypto/cert.py", line 76, in _normalize_domain
return prefix + idna.encode(domain, uts46=True).lower().decode('ascii')
File "/usr/lib/python3.9/site-packages/idna/core.py", line 349, in encode
s = uts46_remap(s, std3_rules, transitional)
File "/usr/lib/python3.9/site-packages/idna/core.py", line 318, in uts46_remap
from .uts46data import uts46data
ImportError: cannot import name 'uts46data' from partially initialized module 'idna.uts46data' (most likely due to a circular import) (/usr/lib/python3.9/site-packages/idna/uts46data.py)
This happens beacause uts46_remap()
imports from .uts46data import uts46data
inside. When this happens in two threads in parallel, it leads to error above. Really hard to trigger, but it happens sometimes.
I imagine you can fix this for your application by calling idna.encode()
once before the multiple threads are started, so the import will have already happened.
Yes, I fixed exactly in this way. But the bug exists in the library anyway.
This bug has been open a while and I am keen to fix it. Before I roll up my sleeves and do some research on the best approach, I'd invite anyone who is more familiar with an appropriate idiomatic approach to contribute a patch to resolve it. My aim would be to preserve lazy loading the UTS46 data until needed, if possible.
There must be something non-obvious going on here, or a bug in Python itself or in libc on @socketpair's machine, because Python already uses a lock around imports - originally global, preventing all concurrent imports, and since 3.3 on a per-module basis:
In Python 3.3, importing a module takes a per-module lock. This correctly serializes importation of a given module from multiple threads (preventing the exposure of incompletely initialized modules)
https://docs.python.org/3.3/whatsnew/3.3.html#a-finer-grained-import-lock
So the original bug report "cannot happen".
If it is a bug in Python then it could be worked around by adding a manual lock around the import
presumably...
_import_lock = threading.Lock()
def uts46_remap(domain: str, std3_rules: bool = True, transitional: bool = False) -> str:
"""Re-map the characters in the string according to UTS46 processing."""
with _import_lock:
from .uts46data import uts46data
output = ''
...
I had Python 3.9, as you can see. Possibly it's a bug. But I don't know where. Can you reproduce ? it's possible to suspend importing by inserting time.sleep() at the root scope of a file.
I've not been able to reproduce this issue, so for now I am going to close this issue. If it can be reproduced happy to reopen.