mongodb-erlang

No reconnection for mongodb_topology_pool

Open johnzeng opened this issue 7 years ago • 5 comments

I use mongodb_topology_pool to pool my MongoDB requests, but I found that if MongoDB is shut down, mongodb_topology_pool cannot reconnect, so I have to restart the whole application.

  • run the application with the MongoDB SDK
  • stop MongoDB
  • requests hang in the MongoDB operation
  • restart MongoDB; no reconnection happens.

I tried to figure out what's wrong, and it looks like mc_pool_sup is always down after the workers hit a connection error.

johnzeng, Jan 22 '17 06:01

Sounds like a bug.

comtihon, Jan 22 '17 14:01

I think the problem is that the supervisor retries the connection too quickly, before MongoDB has been restarted. The restarts exceed the supervisor's intensity within the period, so the supervisor itself is terminated.
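To illustrate the restart-intensity behaviour described above, here is a minimal, self-contained sketch. It is not the library's code: the module name `intensity_demo`, the `start_worker/0` function and the intensity/period values are all assumptions chosen for the demonstration. The worker exits immediately, standing in for a pool worker whose connection attempt fails while the database is down.

```erlang
-module(intensity_demo).
-behaviour(supervisor).
-export([run/0, start_link/0, start_worker/0, init/1]).

%% Hypothetical stand-in for a pool worker: it exits right away,
%% as if the connection to the database had been refused.
start_worker() ->
    {ok, spawn_link(fun() -> exit(connection_refused) end)}.

start_link() ->
    supervisor:start_link(?MODULE, []).

init([]) ->
    %% Assumed values: at most 3 restarts within 5 seconds.
    SupFlags = #{strategy => one_for_one, intensity => 3, period => 5},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Starts the supervisor and reports whether it gave up.
%% Note: this turns on trap_exit in the calling process.
run() ->
    process_flag(trap_exit, true),
    {ok, Sup} = start_link(),
    receive
        {'EXIT', Sup, Reason} -> {supervisor_terminated, Reason}
    after 2000 ->
        still_running
    end.
```

Calling `intensity_demo:run()` should show the supervisor terminating almost immediately, because the failing worker burns through the allowed restarts long before the period ends. Once that happens, nothing restarts the worker unless the supervisor itself has a parent that restarts it.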

johnzeng, Jan 22 '17 15:01

Yes, I noticed that too when testing this issue. I am now thinking about refactoring the mongoc module, as it is very hard for me to dive into this code. I think this error will be solved after the refactoring. But if it is critical for you, you can make a PR to master based on the old code.

comtihon, Jan 22 '17 17:01

No, it's not critical. I happened to find this when I was switching my MongoDB server, and it won't happen frequently. I can wait for the refactoring, and I think that will be better.

johnzeng, Jan 23 '17 02:01

The only thing that starts mc_pool_sup is mongoc:connect/3, which ends up calling mc_pool_sup:start_link/0 in the calling process. So if you happen to call mongoc:connect/3 from a process that ignores the resulting exit signal, like I did (e.g. a supervisor, which always ignores 'EXIT' from non-children), then nothing will restart it. There are a few possible solutions I can think of:

  1. return the mc_pool_sup pid from mongoc:connect/3 if it had to be started, but then this would change its return type;
  2. remove the call to mc_pool_sup:ensure_started/0 from mongoc:connect/3, and add to documentation that you must start mc_pool_sup before using mongoc;
  3. ditto, but just start mc_pool_sup from mc_super_sup;
  4. use supervisor:start_child(mc_super_sup, ...) in mc_pool_sup:ensure_started/0 instead of calling start_link/0 directly.
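Option 4 could look roughly like the sketch below. This is only an illustration under stated assumptions, not the library's implementation: `demo_top_sup` stands in for mc_super_sup and `demo_pool_sup` for mc_pool_sup, and the module name, function names and supervisor flags are all hypothetical. The point is that the pool supervisor becomes a child of a top-level supervisor, so it is restarted by that parent rather than being linked to whichever process happened to call connect.

```erlang
-module(ensure_started_demo).
-behaviour(supervisor).
-export([start_top/0, start_pool_sup/0, ensure_pool_sup/0, init/1]).

%% Hypothetical top-level supervisor (stand-in for mc_super_sup).
start_top() ->
    supervisor:start_link({local, demo_top_sup}, ?MODULE, top).

%% Hypothetical pool supervisor (stand-in for mc_pool_sup).
start_pool_sup() ->
    supervisor:start_link({local, demo_pool_sup}, ?MODULE, pool).

%% Start the pool supervisor under the top-level supervisor,
%% or return the existing pid if it is already running.
ensure_pool_sup() ->
    Spec = #{id => demo_pool_sup,
             start => {?MODULE, start_pool_sup, []},
             restart => permanent,
             type => supervisor},
    case supervisor:start_child(demo_top_sup, Spec) of
        {ok, Pid} ->
            {ok, Pid};
        {error, {already_started, Pid}} ->
            {ok, Pid};
        {error, already_present} ->
            %% The spec is registered but the child is not running.
            supervisor:restart_child(demo_top_sup, demo_pool_sup);
        {error, Reason} ->
            {error, Reason}
    end.

%% Both demo supervisors start with no children; flag values are assumed.
init(_Role) ->
    {ok, {#{strategy => one_for_one, intensity => 5, period => 10}, []}}.
```

With this shape, calling `ensure_started_demo:start_top()` once and then `ensure_started_demo:ensure_pool_sup()` any number of times returns the same pid while the pool supervisor is alive. The return type of connect would not need to change, which is the drawback noted for option 1.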

jessa0, Mar 10 '17 20:03