
[Type 3] Audit thread-safety or other concurrency

Open · septatrix opened this issue 3 years ago · 1 comment

Summary

It is currently unclear whether this library is thread-safe. According to https://github.com/psf/requests/issues/2766, a single session per thread is preferred, so should we create a separate Server instance for each thread?
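One way to follow the one-session-per-thread advice is to keep a separate client object in thread-local storage. The sketch below is only illustrative: `get_server` and `make_server` are hypothetical names that are not part of the library, and `FakeServer` is a stand-in so the example runs without a Tableau Server. In real use the factory would construct and sign in a Server instance.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # each thread sees its own attributes on this object


def get_server(make_server):
    """Return the calling thread's own client, creating it on first use."""
    server = getattr(_local, "server", None)
    if server is None:
        server = make_server()  # hypothetical factory supplied by the caller
        _local.server = server
    return server


class FakeServer:
    """Stand-in for a real client; records which thread created it."""

    def __init__(self):
        self.owner = threading.get_ident()


def task(_):
    server = get_server(FakeServer)
    # Repeated calls in one thread return the same object, and that object
    # was created by the thread that is using it.
    same_object = server is get_server(FakeServer)
    return same_object and server.owner == threading.get_ident()


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(task, range(16)))
```

Whether per-thread instances are sufficient for this library is exactly the open question of this issue; the sketch only shows how the instances could be kept apart.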

Alternatively, I would be very interested in an asyncio version of this library built on httpx or a similar library. This is the direction in which Python is generally moving, and it is a lot easier to reason about than multithreaded concurrency. It would also lead to better code style, since all the hidden I/O that is currently performed would be eliminated.

septatrix, Nov 29 '22 22:11

I ran into a similar issue, and as a result I am quite sure the library is not thread-safe.

Calls like the ones shown below were causing race conditions for me (and, as a result, giving me PDFs with the wrong filenames). In the meantime, I decided to guard these calls with a semaphore.

To reproduce, I ran this code in a ThreadPoolExecutor, specifying different filters (via pdf_request_options) for the same view, e.g.:

server.views.populate_pdf(view, pdf_request_options)
return view.pdf
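The interim workaround of serializing these calls could look like the following sketch. It uses a `threading.Lock` in place of the semaphore, and it assumes the race comes from state held on the shared view object between `populate_pdf` and the read of `view.pdf`. The helper `fetch_pdf` and the `Fake*` classes are hypothetical stand-ins so the example runs without a Tableau Server.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

_pdf_lock = threading.Lock()


def fetch_pdf(server, view, pdf_request_options):
    # Hold the lock across both steps so no other thread can re-populate
    # the shared view between populate_pdf() and the read of view.pdf.
    with _pdf_lock:
        server.views.populate_pdf(view, pdf_request_options)
        return view.pdf


class FakeViews:
    """Stand-in endpoint that stores its result on the view object."""

    def populate_pdf(self, view, options):
        view.pdf = f"pdf for {options}"
        time.sleep(0.001)  # widen the window in which another thread could interfere


class FakeServer:
    views = FakeViews()


class FakeView:
    pdf = None


server, view = FakeServer(), FakeView()
filters = ["region=East", "region=West", "region=North", "region=South"]
with ThreadPoolExecutor(max_workers=4) as pool:
    pdfs = list(pool.map(lambda f: fetch_pdf(server, view, f), filters))
```

The lock trades throughput for correctness, since only one PDF request runs at a time.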

The problem also shows up in a different way with the latest version of the library and the logging module: I can get multiple such calls in a row, because the blocking state is not shared across threads.

2023-04-25 21:12:20,725 [DEBUG] None
None
2023-04-25 21:12:20,726 [DEBUG] None
None
2023-04-25 21:12:20,727 [DEBUG] None
[21:12:20] Async request failed: retrying
2023-04-25 21:12:20,727 [DEBUG] [21:12:20] Async request failed: retrying
[21:12:20] Async request failed: retrying
2023-04-25 21:12:20,727 [DEBUG] [21:12:20] Async request failed: retrying
[21:12:20] Async request failed: retrying
2023-04-25 21:12:20,727 [DEBUG] [21:12:20] Async request failed: retrying
[21:12:20] Begin blocking request to https://tableauserver.com/api/3.14/sites/xxxxx-site-1/views/xxxxx-view-2/pdf
2023-04-25 21:12:20,727 [DEBUG] [21:12:20] Begin blocking request to https://tableauserver.com/api/3.14/sites/xxxxx-site-1/views/xxxxx-view-2/pdf
[21:12:20] Begin blocking request to https://tableauserver.com/api/3.14/sites/xxxxx-site-1/views/xxxxx-view-2/pdf
2023-04-25 21:12:20,728 [DEBUG] [21:12:20] Begin blocking request to https://tableauserver.com/api/3.14/sites/xxxxx-site-1/views/xxxxx-view-2/pdf
[21:12:20] Begin blocking request to https://tableauserver.com/api/3.14/sites/xxxxx-site-1/views/xxxxx-view-2/pdf
2023-04-25 21:12:20,728 [DEBUG] [21:12:20] Begin blocking request to https://tableauserver.com/api/3.14/sites/xxxxx-site-1/views/xxxxx-view-2/pdf
[21:12:21] Call finished
2023-04-25 21:12:21,221 [DEBUG] [21:12:21] Call finished
[21:12:21] Request complete
2023-04-25 21:12:21,221 [DEBUG] [21:12:21] Request complete
[21:12:21] Call finished
2023-04-25 21:12:21,230 [DEBUG] [21:12:21] Call finished
[21:12:21] Request complete
2023-04-25 21:12:21,230 [DEBUG] [21:12:21] Request complete
Response status: <Response [200]>
2023-04-25 21:12:21,230 [DEBUG] Response status: <Response [200]>
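When reading interleaved output like this, it can help to tag each record with the thread that emitted it. Below is a minimal sketch using only the standard `logging` module. The format string, the logger name `thread_demo`, and the worker names are illustrative choices, not something the library sets up.

```python
import io
import logging
import threading

stream = io.StringIO()  # captured here for the demo; a real setup would log to stderr or a file
handler = logging.StreamHandler(stream)
handler.setFormatter(
    # %(threadName)s adds the emitting thread's name to every record
    logging.Formatter("%(asctime)s [%(levelname)s] [%(threadName)s] %(message)s")
)

logger = logging.getLogger("thread_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep demo records out of the root logger


def work():
    logger.debug("Begin blocking request")


threads = [threading.Thread(target=work, name=f"worker-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

output = stream.getvalue()
```

With the thread name in each line, it becomes easier to see which worker started which blocking request and which worker a given "Call finished" belongs to.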

This problem appears to extend to the Endpoint class itself, as rewriting _blocking_request and send_request_while_show_progress_threaded alone did not fix it.

https://github.com/tableau/server-client-python/blob/307d8a20a30f32c1ce615cca7c6a78b9b9bff081/tableauserverclient/server/endpoint/endpoint.py#L79-L95

dxdc, Apr 26 '23 04:04