wget2
wget2 copied to clipboard
Eliminate all SOCKET_TO_FD usage in wolfcrypt, more error checking around invalid sockets
OK between this and the poll fixes in gnulib I am unable to recreate any deadlocks or faults even with a dozen+ transfers with concurrency.
The big thing here is removing all the calls to the SOCKET_TO_FD macro. While FD_TO_SOCKET is totally safe, the inverse is not. Every time it is called a new fd is created. Note this was modeled similarly to: https://github.com/rockdaboot/wget2/blob/14d0460597ab98a519af5396d91bb430256023e6/libwget/ssl_openssl.c#L1785 which almost certainly needs to also be changed due to the same issue.
To avoid using that we need to be keeping a reference to the FD and the socket (well technically only the FD but might as well track both as right now we track the socket instead anyway).
Doing this the cleanest way I figured was to extend the wolfssl session object to a new wolf_extended_session struct that stores the FD/socket references as well. Keeping a lookup table or storing it in the global session info (and adding another accessor method to get it off the session) seemed worse.
Note there is a good bit of commented code in ssl_wolfssl.c and an existing (seemingly unused) struct session_context
. I haven't touched most of this, and I am not sure if it is TODO code or more old code that is no longer needed, but adding my new extended session struct may make things more confusing.
Briefly looking at the openssl code I think we would need to do similar there.
The error message in wget_ready_2_transfer I am not sure the best way to do so. Honestly the ability to call poll with an infinite timeout and no valid FD seems stupid to support but is technically a valid (or at least allowed) use case in *nix that hangs indefinitely. The definition says that any negative FD's are ignored, the idea being you invert the FD to avoid having to remove it from the set to just not watch it, although -1 isn't really a valid FD or inverted FD but would meet that condition. In addition it does explicitly call out that calling poll with no FD but a timeout just makes it a glorified sleep so that condition one wants to preserve. All of this is me saying that while calling with a invalid FD is legal, we might as well throw an error on our side. It is more a developer message than anything, and we are returning an error code but it could save someone some headaches. Doing it as a debug message seemed decent (it would have caught thing like some of the earlier wolfssl bugs fixed).
Finally, the other half of the deadlock resolution are the poll fixes. Here is my planned patch for upstream: https://github.com/mitchcapper/gnulib/compare/master...ours_win32_poll_fixes
we already have the poll_retry code in, my suggestion would be to add the poll function in its entirety under a different name (tmp_poll? updated_poll?) and change the poll_retry to use it. Once gnulib changes then we could just remove both and use the upstream function. I can do a PR for this if it sounds like a good idea.
Hey Mitch,
awesome work!
Once gnulib changes then we could just remove both and use the upstream function. I can do a PR for this if it sounds like a good idea.
Yeah, getting this into upstream is a good idea. Please do that first and when accepted, I'll update gnulib for wget2 (this sometimes comes with bugs/inconveniences, but I'll take care).
Meanwhile, I have time to think about your changes to WolfSSL. Did you ever test GnuTLS on Windows?