BEAM VM becomes stuck when the number of connections is high
I have an application that connects to multiple MQTT servers, each running locally in its own network namespace. When the number of connections is too high, the BEAM VM becomes stuck. I have written an example project by extracting code from our codebase, along with a README that explains how to reproduce the problem. You can find it here. I would be happy if you could find the time to try it and tell me whether you can reproduce the problem.
I am using Debian 9 (stretch), Elixir 1.8.2 and Erlang/OTP 21.3.8.8.
And thanks for writing this software :-).
P.S: I was once lucky enough to have the observer print a few more graphs before being completely overloaded:

This is surprising because the network namespaces are created, and the mosquitto servers started, at a rate of approximately one per second.
I am currently in the process of a major rewrite (admittedly it has been going on for a long while), which will bring MQTT 5 support to Tortoise. I have recently picked up development again and am wrapping my head around what is needed to make it a release candidate, but the architecture differs, so I hope you will let me release that and then look at this issue?
And thanks for using Tortoise; spawning 300 tortoises on a single node is not a use-case I anticipated :)
It's great to hear that you are planning to further develop Tortoise! Maybe the problem will go away after the rewrite?
Thanks, I will keep an eye on the project development.
> And thanks for using Tortoise; spawning 300 tortoises on a single node is not a use-case I anticipated :)
It sounds a bit unusual, but that's what is needed to simulate IoT devices for my team.
I forgot this information: most of the scheduler processes were in the same calls when I generated a dump (from the original problem; I did not generate a dump of the example):
Current Process CP: 0x00007f26a080db08 ('Elixir.Registry':unregister_match/4 + 952)
Current Process Limited Stack Trace:
0x00007f260d5c9348:SReturn addr 0x15A873D8 ('Elixir.Tortoise.Events':unregister/2 + 152)
0x00007f260d5c93a0:SReturn addr 0x15A79DF0 ('Elixir.Tortoise.Connection':connection/2 + 952)
0x00007f260d5c93a8:SReturn addr 0x15A8FCD8 ('Elixir.Tortoise':publish/4 + 384)
[...]
Oh; I have a registry that I use as a pubsub, so that processes can subscribe to a TCP socket: I move the TCP socket to the process that will send a QoS=0 message. The problem could be that the registry gets overwhelmed when too many tortoises are running.
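For readers unfamiliar with the pattern, here is a minimal sketch of using an Elixir `Registry` as a pubsub, ending with the same `Registry.unregister_match/4` call that shows up in the stack trace above. This is not Tortoise's actual code: the registry name `DemoEvents`, the key shape `{:connection, client_id}`, the `:waiting` value and the `:fake_socket` placeholder are all made up for illustration.

```elixir
# Hypothetical names throughout; this only illustrates the Registry-as-pubsub idea.
{:ok, _} = Registry.start_link(keys: :duplicate, name: DemoEvents)

client_id = "device-1"

# A process that wants the socket registers itself under a per-client key.
{:ok, _owner} = Registry.register(DemoEvents, {:connection, client_id}, :waiting)

# The side that owns the connection notifies every process registered for that key.
Registry.dispatch(DemoEvents, {:connection, client_id}, fn entries ->
  for {pid, _value} <- entries do
    send(pid, {:connection, client_id, :fake_socket})
  end
end)

# The subscriber (here, the same process) picks up the message.
received =
  receive do
    {:connection, ^client_id, socket} -> socket
  after
    1_000 -> :timeout
  end

# Once served, the subscriber removes its own entry for that key again.
:ok = Registry.unregister_match(DemoEvents, {:connection, client_id}, :waiting)

remaining = Registry.lookup(DemoEvents, {:connection, client_id})
```

In this sketch every publish involves a register, a dispatch and an unregister against one shared registry, so with several hundred clients publishing at once all of them go through the same structure.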
…an interesting case. I will look further into this at a later time.
Maybe related to https://github.com/pallix/veth_network_namespaces_perf