libwebsockets

cmake: do not allow netlink support when external poll support is enabled.

Open p-luke opened this issue 1 year ago • 4 comments

Netlink fds are not passed to the user callback for external loop adoption, because the vhost and its protocols are not yet in place when rops_pt_init_destroy_netlink() gets called during context creation. As a consequence, netlink fds are not polled, leading to outdated or missing routing table entries in the library, especially when the application starts at the same time as the network interfaces come up. A complex rework would be required to support this, so for now disable netlink when external poll is needed.

This is the sequence of function calls:

  • lws_create_context() calls LWS_ROPS_pt_init_destroy() for every available role (so netlink too, on Linux)
  • rops_pt_init_destroy_netlink() creates the netlink fd and passes it to lws_wsi_inject_to_loop(), which then passes it to __insert_wsi_socket_into_fds()
  • here, if LWS_WITH_EXTERNAL_POLL is defined, the fd is normally passed to the LWS_CALLBACK_ADD_POLL_FD callback, but wsi->a.vhost is not yet initialized at this point, so the fd is not passed to the application, which gets no chance to add it to its own poll loop

So the netlink socket ends up not being polled, and all the library logic that depends on routing table entries is disrupted. For this reason, if the application needs LWS_WITH_EXTERNAL_POLL support, it is unfortunately better to disable netlink on Linux as well.
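The ordering problem described above can be shown with a small toy model. This is only a sketch and not libwebsockets code: every `toy_` name is hypothetical, and the model assumes only what is stated above, namely that the add-fd callback is reached through the vhost and that the netlink fd is inserted before any vhost exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are libwebsockets types. */
struct toy_vhost { int unused; };
struct toy_wsi { int fd; struct toy_vhost *vhost; };

/* Number of fds the application's own poll loop was told about. */
static int toy_fds_seen_by_app;

/* Stand-in for the application's "add this fd to my loop" handler. */
static void toy_app_add_poll_fd(int fd)
{
    (void)fd;
    toy_fds_seen_by_app++;
}

/*
 * Models the insert step: the callback is reachable only through the
 * vhost. Returns 1 if the application was notified, 0 otherwise.
 */
static int toy_insert_into_fds(struct toy_wsi *wsi)
{
    assert(wsi != NULL);
    if (!wsi->vhost)
        return 0; /* no vhost yet: the fd is silently skipped */
    toy_app_add_poll_fd(wsi->fd);
    return 1;
}

/*
 * Models the creation order: role init (netlink) runs first, the vhost
 * appears later. Returns how many fds the application ended up seeing.
 */
static int toy_create_context_order(void)
{
    struct toy_vhost vh = { 0 };
    struct toy_wsi netlink_wsi = { 7, NULL }; /* created before vhost */
    struct toy_wsi later_wsi = { 8, NULL };

    toy_fds_seen_by_app = 0;
    toy_insert_into_fds(&netlink_wsi); /* skipped, never polled */

    later_wsi.vhost = &vh;             /* vhost now exists */
    toy_insert_into_fds(&later_wsi);   /* application is notified */

    return toy_fds_seen_by_app;
}
```

In this model the application sees only the fd inserted after the vhost exists, and the earlier netlink fd is never offered to it.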

p-luke avatar May 16 '24 13:05 p-luke

Thanks for the explanation... isn't it better to solve this by finally removing EXTERNAL_POLL?

There's support for custom eventlib stuff now that should be better in every way.

Or can't netlink be initialized later when EXTERNAL_POLL can handle it?

lws-team avatar May 16 '24 13:05 lws-team

> Or can't netlink be initialized later when EXTERNAL_POLL can handle it?

I was trying to analyze the lws_create_context() execution flow to see how to achieve this, but I am not there yet... Netlink, being a role, is initialized along with all the other roles well before wsi->a.vhost is in place, and I do not know whether moving all role initialization after that point could break something else (maybe there is code in between which requires the roles to be initialized...?). I have not yet analyzed all the code related to context creation.

p-luke avatar May 20 '24 08:05 p-luke

Maybe it's enough if the Netlink role sees that you're using EXTERNAL_POLL and defers its initialization until you trigger it manually, after EXTERNAL_POLL has started.
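As a rough illustration of what such a deferral could look like, here is a toy model. It is not libwebsockets code and does not show how the role would actually be reached: every `toy_` name is hypothetical, and the "attach" step stands in for whatever manual call the application would make once its own loop is running.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_FD (-1)

/* Hypothetical per-thread state; not a libwebsockets structure. */
struct toy_pt {
    int external_poll; /* nonzero if the app runs its own poll loop */
    int pending_fd;    /* created, but not yet announced to anyone */
    int announced_fd;  /* fd that some loop has been told to poll */
};

/*
 * Role init: with external poll, park the fd instead of announcing it
 * at a point where the application cannot be reached yet.
 */
static void toy_netlink_pt_init(struct toy_pt *pt, int fd)
{
    assert(pt != NULL);
    pt->pending_fd = TOY_NO_FD;
    pt->announced_fd = TOY_NO_FD;
    if (pt->external_poll)
        pt->pending_fd = fd;   /* defer */
    else
        pt->announced_fd = fd; /* built-in loop polls it directly */
}

/*
 * Called manually by the application once its loop is up. Returns the
 * fd to add to the external loop, or TOY_NO_FD if nothing is pending.
 */
static int toy_netlink_attach_deferred(struct toy_pt *pt)
{
    int fd;

    assert(pt != NULL);
    fd = pt->pending_fd;
    if (fd == TOY_NO_FD)
        return TOY_NO_FD;
    pt->pending_fd = TOY_NO_FD;
    pt->announced_fd = fd;
    return fd;
}
```

The only point of the model is that the fd is handed over exactly once, and only at a time the application chooses.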

lws-team avatar May 20 '24 09:05 lws-team

I am not sure how to achieve this deferred initialization. The netlink role gets called via lws_rops_t callbacks, and the socket creation is performed in pt_init_destroy. The role has no other callbacks set (apart from handle_POLLIN), so how could I manually call netlink to create the socket?

p-luke avatar May 29 '24 11:05 p-luke