laboratory
Run control and candidate in parallel or candidate async
When setting up an experiment, we often want to ensure that the user isn't adversely affected. To keep possible slowdowns in the candidate code from affecting the user, would it be possible to run the control and the candidate in parallel? Or better yet, run the candidate completely asynchronously?
I have spent a lot of time writing JavaScript recently so please forgive my lack of ideas on how to solve this problem in Python. Thanks for porting this to Python by the way!
True parallelism for CPU-bound code is not really doable with threads in Python because of the global interpreter lock. It could spawn a subprocess and run the candidate there, but I don't fully know the implications of doing so, and it seems like it could be an undesirable side effect for some. Something to look into, but my gut feeling is not good.
Running it async could also affect the results of the experiment if the code accesses a shared resource like a database or cache that has now changed, so that's not really doable either.
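To make the shared-resource concern concrete, here is a minimal sketch. The `cache` dict and both functions are hypothetical stand-ins for a real cache and real code paths. The two code paths are identical, yet the experiment reports a mismatch because a write lands between the control and the deferred candidate.

```python
# Hypothetical shared resource; stands in for a database or cache.
cache = {"user:1:plan": "free"}


def control():
    return cache["user:1:plan"]


def candidate():
    # Same logic as the control, so any mismatch is caused by timing alone.
    return cache["user:1:plan"]


control_value = control()      # runs during the request
cache["user:1:plan"] = "pro"   # another request writes before the candidate runs
candidate_value = candidate()  # runs later, "asynchronously"

mismatch = control_value != candidate_value
print(control_value, candidate_value, mismatch)
```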
In an ideal world it'd work this way but I think users of the library will have to accept correctness over performance whilst they run the experiment.
You could provide an async=True argument to indicate that the code should run in a thread. If your application is already threaded and the control and/or candidate perform I/O or release the GIL, that would help performance.
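A rough sketch of the threaded idea, using only the standard library. This is not the library's API: `observe` and `conduct` are hypothetical names, and the executor is assumed to be owned by the application. Note also that `async` has been a reserved keyword since Python 3.7, so a real flag would need a different name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def observe(func):
    """Call func and capture its value, any exception, and its duration."""
    start = time.perf_counter()
    try:
        value, error = func(), None
    except Exception as exc:  # a failing candidate must not reach the user
        value, error = None, exc
    return {"value": value, "error": error, "duration": time.perf_counter() - start}


def conduct(control, candidate, executor):
    """Run the candidate on a worker thread while the control runs here."""
    candidate_future = executor.submit(observe, candidate)
    control_obs = observe(control)
    candidate_obs = candidate_future.result()  # still waits for the candidate
    if control_obs["error"] is not None:
        raise control_obs["error"]
    return control_obs["value"], control_obs, candidate_obs


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        value, control_obs, candidate_obs = conduct(
            lambda: sum(range(10)), lambda: 45, pool
        )
        print(value, control_obs["value"] == candidate_obs["value"])
```

Because `conduct` waits on the future, the caller pays roughly the slower of the two durations rather than their sum, and only when the work releases the GIL. Returning before the candidate finishes would be the "completely asynchronous" variant, with the shared-state caveat raised earlier in the thread.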
If it's going down this path, what do we mean by "async"? Parallelism or concurrency?
Concurrency. We don't want to use processes in 99% of cases.
Yeah, I also agree that running it async could affect the results of the experiment if the code accesses a shared resource, like a database or cache, that has since changed. But I don't think the solution is to avoid concurrency altogether. If you want to improve performance, you could provide an async=True argument to indicate that the code should run in a thread. If your application is already threaded and the control and/or candidate perform I/O or release the GIL, that would help performance.
If you have a task queue, you may be able to run the candidate on a worker, passing the user/param context to it (or having the worker pull it from a cache, etc.). You can also give each task an expiration to ensure the backlog doesn't grow unbounded.
As for recording results, that does pose some challenges. It is still doable, but it would require extra custom code.
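A minimal sketch of the worker approach with expiration and custom recording. It assumes an in-process `queue.Queue` as a stand-in for a real task queue. `enqueue_candidate`, `drain`, and the job fields are hypothetical names, not part of any library.

```python
import queue
import time


def enqueue_candidate(jobs, candidate, context, control_value, ttl_seconds):
    """Queue the candidate with its context and an expiry deadline."""
    jobs.put({
        "candidate": candidate,
        "context": context,
        "control_value": control_value,  # carried along so the worker can compare
        "expires_at": time.monotonic() + ttl_seconds,
    })


def drain(jobs, results):
    """Worker loop body: run fresh jobs, drop expired ones, record both."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return
        if time.monotonic() > job["expires_at"]:
            results.append({"context": job["context"], "status": "expired"})
            continue
        value = job["candidate"](**job["context"])
        results.append({
            "context": job["context"],
            "status": "ran",
            "matched": value == job["control_value"],
        })


if __name__ == "__main__":
    jobs, results = queue.Queue(), []
    enqueue_candidate(jobs, lambda user_id: user_id * 2, {"user_id": 21}, 42, 60)
    drain(jobs, results)
    print(results)
```

The control value travels with the job, so the comparison happens on the worker. That is the "extra custom code" part: the recording step no longer runs in the request that produced the control result.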