Decoder isn't Threadsafe.
While looking into making a new cython library as explained in a pervious pull request I found out that the current pure python implementation is not threadsafe due to the use of global variables and I was wondering if contextvars or threading.local varaiables could be given a try to try and remedy the problem with the decoding tables or if a class object could be looked into to make the object threadsafe?
Would you mind pointing out the global variables? That should be refactored indeed! My quick code search didn't yield any results for global. Which variables are you referring to?
As to thread-safety: this is by design and up to the consumer of the hpack library. You get an Encoder and Decoder. If you want to use them from different threads, it is up to you to ensure thread safety.
Would you mind pointing out the global variables? That should be refactored indeed! My quick code search didn't yield any results for
global. Which variables are you referring to?As to thread-safety: this is by design and up to the consumer of the hpack library. You get an
EncoderandDecoder. If you want to use them from different threads, it is up to you to ensure thread safety.
Yeah it's the large decoder table actually. A threading.local variable would mitigate the costliness of it.
@Vizonex are you referring to this table? https://github.com/python-hyper/hpack/blob/056200090577c72978257476e8776afab88595a8/src/hpack/huffman_table.py#L133
This should only ever be accessed in a read-only way - consider it a constant value that will not be modified during encoding or decoding. What costliness are you considering here? Happy to review a PR and benchmark test + results.
@Vizonex are you referring to this table?
hpack/src/hpack/huffman_table.py
Line 133 in 0562000
HUFFMAN_TABLE = [ This should only ever be accessed in a read-only way - consider it a constant value that will not be modified during encoding or decoding. What costliness are you considering here? Happy to review a PR and benchmark test + results.
That's the one I am looking at trying to tweak. I'll probably try writing up something when I get off work today.
Interesting to see your results! My understanding is that constant variables (variables whos values are not changing) and accessing them in a read-only fashion from different threads should be possible without issues. Please report any errors, exceptions, data race conditions, or performance issues you find.
In this case, the HUFFMAN_TABLE is a Python list. The only two uses of this variable are in a simple index-based item access: state, flags, output_byte = HUFFMAN_TABLE[index]. This should be thread-safe by default when using a GIL-based Python version.
@Kriechi There's a PR If you want to look at it, I am planning to implement a few optimizations to trim out some larger portions of memory incase needed