aerospike-client-python
Aerospike query stuck after fork
Hi,
In a forked server everything works fine when using shm, except queries.
Immediately after the query call (query.results()) the process hangs. The same happens with query.foreach().
Simple example: https://gist.github.com/tivvit/7e7448742a8b017326dd
Since (I think) the client creates threads, forking after creating it is not safe. You probably want to create your client object(s) after you fork, in the child. (Not sure it's the cause of the problem you noticed, but FWIW.)
I am not sure if this is useful or even related, but RonRothman helped me solve a problem. I was trying to run a celery task that involved an aerospike call. It hung every time I tried to use query.results() or query.foreach(). However, the same function worked properly when run synchronously from an interactive session. After reading this, I moved the client creation from the top of the file (right after the imports) into the function that was actually being executed, and that was enough to get it running. I am just doing a proof of concept at this point. If it works out, I am sure I will have to resolve this more completely and will report back.
Thanks for all the info and the sample code. Yes, the client creates threads, and in this case the C client hangs waiting for the thread to complete. I'm still looking into it, but in the meantime, is it possible for you to use threads instead of os.fork()?
An example here: https://gist.github.com/jboone100/e1f88ac510f57d7cc46e
I'll continue to investigate this issue.
Another option is to move the client.connect() call to after the fork().
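A rough sketch of one way to make "connect after the fork" automatic: create the client lazily and rebuild it whenever the process ID changes. PerProcessClient and factory are hypothetical names for illustration, not part of the aerospike API. In real use the factory would be something like lambda: aerospike.client(config).connect(); here it is left generic so the sketch runs without a cluster. Note that this sketch simply drops the client inherited from the parent rather than closing it.

```python
import os


class PerProcessClient:
    """Hypothetical helper: builds a client on first use in each process."""

    def __init__(self, factory):
        # factory is any zero-argument callable that returns a connected client.
        self._factory = factory
        self._client = None
        self._pid = None

    def get(self):
        pid = os.getpid()
        # A different PID means we are in a forked child: the inherited
        # client has no worker threads here, so build a fresh one.
        if self._client is None or self._pid != pid:
            self._client = self._factory()
            self._pid = pid
        return self._client
```

A task or request handler would then call holder.get() at the point of use instead of relying on a module-level client created before uWSGI or celery forks its workers.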
That is actually how I did it. Not using os.fork() is unfortunately not an option: in my very limited world both uWSGI and celery workers execute a fork() as part of what they do. Moving the client.connect() call does resolve the issue, but the problem can be hard to troubleshoot when some tasks just seem to hang :)
OK. I found out what is happening, and this is not a Python-specific issue. It has to do with forking and the C client's worker thread pool.
client.connect() : C client creates a pool of worker threads.
fork() : Process is created but worker threads are not copied to child process.
query.foreach() : C client hangs waiting for a non-existent thread to complete some work
This also explains why the foreach() will succeed if you allow the parent process to do the work. So you will need to move the connect() to after the fork() for this to work correctly.
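The sequence above can be reproduced without aerospike at all. The following is a minimal sketch using only the Python standard library (count_threads_in_child is a made-up name, and a plain threading.Thread stands in for the C client's worker pool): it starts a worker thread, forks, and reports how many threads each side sees. It assumes a POSIX system where os.fork() is available.

```python
import os
import threading


def count_threads_in_child():
    """Start a worker thread, fork, and return (parent_count, child_count)."""
    stop = threading.Event()
    # Stand-in for a worker thread created by connect(); it just waits.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the thread that called fork() exists here.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: read the child's thread count, then clean up.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return parent_count, child_count
```

The child reports fewer threads than the parent because the worker was not carried across the fork, which is why anything in the child that waits on that worker never returns.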
I'll look into uWSGI and celery and work up some examples.