wireguard-go icon indicating copy to clipboard operation
wireguard-go copied to clipboard

Add illumos support

Open nshalman opened this issue 4 years ago • 22 comments

Wireguard (wireguard-go) for illumos

TODO:

  • [ ] Pass review
  • [ ] a wg-quick(8) implementation, like we have for FreeBSD and OpenBSD and such: https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick
  • [ ] reliability testing via wg-quick

DONE:

  • [x] reliability testing via Tailscale (I build Tailscale for illumos on top of this work and it's great!)

Original Description

A currently working cleanup of the code by @jclulow from https://github.com/jclulow/wireguard-go-illumos-wip squashed down to a single commit.

This code works with the caveats noted in the commit message, and with additional modifications, can be added to the https://github.comtailscale/wireguard-go tree and be used to build mostly working tailscale binaries.

I barely know what I'm doing here, so any help carrying this across the finish line would be greatly appreciated.

nshalman avatar Aug 21 '20 00:08 nshalman

Pinging @crawshaw and @danderson Per https://github.com/tailscale/tailscale/issues/697#issuecomment-677823895

nshalman avatar Aug 21 '20 00:08 nshalman

No response from David per usual, so I'll review this one. As a preliminary, it sounds like the current shortcoming is the requirement of -f? Any other quirks?

I'll need to get a Illumos VM running to test this out in.

Also, while we're doing this, we'll also need a wg-quick(8) implementation, like we have for FreeBSD and OpenBSD and such: https://git.zx2c4.com/wireguard-tools/tree/src/wg-quick . Interested in doing that? This can be a lot more fiddly than wireguard-go, depending on what kinds of routing schemes are supported by the Illumos kernel.

zx2c4 avatar Aug 22 '20 19:08 zx2c4

get chided for not doing code review on the weekend

The comment was directed at David, not you Dave, who, by his own repeated admission, is often too busy for this stuff.

to encourage people to try and upstream useful things to wireguard-go (rather than keep them tailscale-specific)

If you consider that a remarkable thing to do, you've made it clear where you stand with regards to this ecosystem. Thanks for making that clear to me.

zx2c4 avatar Aug 22 '20 21:08 zx2c4

If you consider that a remarkable thing to do, you've made it clear where you stand with regards to this ecosystem. Thanks for making that clear to me.

I don't consider it remarkable, I don't see how that follows from what I said. Regardless, I'll resume reading this code on Monday, since I'd certainly like Illumos support in wireguard-go, and it sounds like we agree on that.

danderson avatar Aug 22 '20 21:08 danderson

Oh, and regarding wg-quick: we did some preliminary digging on the facilities available in illumos.

The networking facilities are similar to the BSDs (no policy routing, IP_BOUND_IF socket option), so that part of the wg-quick implementation looks almost identical to the openbsd userspace mode: overlay routes for endpoint IPs, and /etc/resolv.conf replacement for DNS configuration.

For running and monitoring the daemon, SMF obviously springs to mind, but would lead to different semantics than the other wg-quick implementations (notably, wg-quick up would survive reboots until turned back off with wg-quick down). So there again, for uniformity of behavior the implementation would look like what openbsd does when kernel support is missing.

Happy to send the patches for wg-quick.

danderson avatar Aug 22 '20 22:08 danderson

Thank you both, @zx2c4 and @danderson, for the feedback. I'll get started on cleaning up the stuff that I already understand, and work on figuring out the bits I don't understand just yet. :)

In the meantime, replying to some non-code threads:

As a preliminary, it sounds like the current shortcoming is the requirement of -f? Any other quirks?

No others that I noticed, though the only exercise of this code on my end was that it worked connecting to a raspberry pi running the kernel module based version running on a rasberry pi a while back, and seems to mostly work when used by my hacked-up fork of tailscale.

For running and monitoring the daemon, SMF obviously springs to mind, but would lead to different semantics than the other wg-quick implementations (notably, wg-quick up would survive reboots until turned back off with wg-quick down). So there again, for uniformity of behavior the implementation would look like what openbsd does when kernel support is missing.

SMF's svcadm enable does support a -t flag which makes it temporary; svadm enable -t svc:/wireguard/wg-quick:default (or however it would be named) would not remain enabled across a reboot.

nshalman avatar Aug 23 '20 01:08 nshalman

The other question we should resolve soon-ish is whether we're going to try for SMF integration in wg-quick, or go down the less-managed route that the BSDs took.

Thanks for offering to work on this. If you're big into Illumos, taking the lead on the wg-quick integration would be much appreciated. I'd prefer to keep the same semantics as all the other wg-quick implementations. If we want to later add deeper Illumos integration with SMF, that sounds fine, but would be a separate endeavor. For now, let's get everything all lined up.

Usually there are a few problems to solve in wg-quick porting, beyond obvious things:

  • How do you change the DNS server in an exclusive way? Most implementations use resolvconf -m 0 -x, which is preferable. Some use the dns hatchet (see contrib/), which is grotesque. Some just overwrite /etc/resolv.conf, which is insufficient. I assume Illumos has something nice and uniform we can use.
  • How do we exempt the WireGuard socket from the default routing rule? Linux uses fwmarks, for example. [@nshalman - you'll want to add an implementation for this in mark_unix.go, by the way.]
  • How do we overwrite the default route without removing the prior default route? Linux uses multiple routing tables. Some of the BSDs &disown a little route monitor in bash. Hopefully Illumos has a Linux-like solution that doesn't require background processes.
  • How do you give an IP address to an interface without providing a "gateway" IP? Linux has the dev wg0 notation. Some BSDs support a -interface switch to ifconfig. Others rely on gross hacks. Hopefully Illumos has something clean.

I would suggest reading all of the current wg-quick implementations, except perhaps the Android one, before starting, and then maybe start with one of the BSD implementations and tweak it.

If you have additional questions about wg-quick development, I suggest putting this discussion on the mailing list, where wg-quick development happens, and where mailing list lurkers periodically pop up with some nice domain specific knowledge.

zx2c4 avatar Aug 24 '20 21:08 zx2c4

Thanks for offering to work on this. If you're big into Illumos, taking the lead on the wg-quick integration would be much appreciated. I'd prefer to keep the same semantics as all the other wg-quick implementations. If we want to later add deeper Illumos integration with SMF, that sounds fine, but would be a separate endeavor. For now, let's get everything all lined up.

Sounds good. In that case, the Illumos implementation of wg-quick is likely to be nearly identical to the OpenBSD implementation:

  • How do you change the DNS server in an exclusive way? Most implementations use resolvconf -m 0 -x, which is preferable. Some use the dns hatchet (see contrib/), which is grotesque. Some just overwrite /etc/resolv.conf, which is insufficient. I assume Illumos has something nice and uniform we can use.

Illumos has no facilities for this that I could find (no resolvconf, and no Illumos-specific facility for managing DNS). It also lacks the piecemeal namespacing support that dns hatchet relies on, so we're stuck with replacing /etc/resolv.conf.

  • How do we exempt the WireGuard socket from the default routing rule? Linux uses fwmarks, for example. [@nshalman - you'll want to add an implementation for this in mark_unix.go, by the way.]

There's no policy routing facilities. Similar to OpenBSD, we'll have to add /32 or /128 routes to the endpoints.

  • How do we overwrite the default route without removing the prior default route? Linux uses multiple routing tables. Some of the BSDs &disown a little route monitor in bash. Hopefully Illumos has a Linux-like solution that doesn't require background processes.

Again, similar situation to the BSDs, no good solution. AFAICT, the current FreeBSD and OpenBSD scripts just add/delete routes per the wg-quick configuration and don't handle conflict at all. The route monitor updates endpoint routes to avoid loops, but does nothing to handle repairing the routing table on exit.

  • How do you give an IP address to an interface without providing a "gateway" IP? Linux has the dev wg0 notation. Some BSDs support a -interface switch to ifconfig. Others rely on gross hacks. Hopefully Illumos has something clean.

Illumos's route(1M) supports -interface, so again should be the same as OpenBSD.

Based on this, once we've merged the initial Illumos support, I'll probably start with a copy of the OpenBSD implementation, then circulate that in the Illumos community to see if there's something we can do better.

danderson avatar Aug 25 '20 02:08 danderson

Is my experimental change (https://github.com/golang/sys/compare/master...nshalman:illumos-wireguard) to x/sys/unix vaguely close to what it should be? If so, I have an alternate branch that I will push into this one that clears out the ASM file and the corresponding code. If not, additional guidance would definitely be welcome. Thanks so much everyone!

nshalman avatar Mar 12 '21 00:03 nshalman

With a lot of help from @danderson this is getting much closer to being ready to land.

nshalman avatar Mar 15 '21 18:03 nshalman

For the record - waiting until we get the upstream bits into x/sys/unix before I review this more closely again, so that we don't go through multiple rounds as review feedback alters the APIs.

danderson avatar Mar 16 '21 20:03 danderson

Just chiming in here to say I'm really looking forward to this! Thank you for hacking on illumos support!

If I can be of any help testing anything, I'd be more than happy to help.

teutat3s avatar Apr 21 '21 16:04 teutat3s

This is currently stuck waiting for me to find time to clean up / refactor https://go-review.googlesource.com/c/sys/+/302831 to the point that it's ready for merging. After that, it should be pretty straightforward to clean up this branch into something reasonable. No ETA unfortunately, but it's not abandoned either.

nshalman avatar Apr 21 '21 20:04 nshalman

Sorry it's been so long. All necessary unsafe things have been moved to x/sys/unix. I imagine there's still more work to be done, but I think this is another moment where it would be a good time for more eyes.

nshalman avatar Apr 27 '21 02:04 nshalman

Thanks for your continued work on this. I did a pass at it, as requested, but it looks like lots of other comments from before are still unaddressed and there are still many XXX: and TODOs in the code. So this stuff should probably be finished up before this is merged.

zx2c4 avatar Apr 27 '21 02:04 zx2c4

Still some more cleanup to do, but we're getting very very close. Not marking for review yet because there are still some unaddressed comments I should take care of.

nshalman avatar Sep 03 '21 02:09 nshalman

Can you test this all and remove the XXX comments before I do a full review?

Yes. Thank you for your patience with me.

If you have questions or need help with certain parts, feel free to say so in ordinary comments.

I vaguely understand what routineRouteListener is doing, but I don't fully understand how; it's the only function that has XXX comments remaining in it.

What is supposed to consume the events on the events channel and how do I confirm that they are being generated correctly? Does it have something to do with the ipc socket? How do I test the ipc socket? (Deleting it does make wireguard-go terminate, so that part seems to work, but I'm assuming it has some additional purpose?)

Also, a question about style. go fmt seems to have opinions about things like build tags being at the top of the file and which lines to put in there, while Dave is saying to put the license and copyright first and to remove one of them. Which one should I listen to?

nshalman avatar Sep 05 '21 12:09 nshalman

I vaguely understand what routineRouteListener is doing, but I don't fully understand how; it's the only function that has XXX comments remaining in it.

It's supposed to be monitoring for the interface going up, down, being removed (perhaps), or having its MTU changed.

What is supposed to consume the events on the events channel and how do I confirm that they are being generated correctly?

Take a look at the readers of this channel for freebsd or openbsd, for example.

For testing, generate the events that it's supposed to be monitoring -- ifconfig .. up/down/mtu/destroy -- and see if this reflects in the program.

Also, a question about style. go fmt seems to have opinions about things like build tags being at the top of the file and which lines to put in there, while Dave is saying to put the license and copyright first and to remove one of them. Which one should I listen to?

Do whatever the other files are doing. go fmt doesn't reformat the other files.

zx2c4 avatar Sep 05 '21 13:09 zx2c4

Also, a question about style. go fmt seems to have opinions about things like build tags being at the top of the file and which lines to put in there, while Dave is saying to put the license and copyright first and to remove one of them. Which one should I listen to?

Do whatever the other files are doing. go fmt doesn't reformat the other files.

Go 1.17 seems to have some new formatting opinions that Go 1.16 did not. I'll use 1.16 for formatting. Thanks!

nshalman avatar Sep 05 '21 13:09 nshalman

Oh, good point. Fixed: https://git.zx2c4.com/wireguard-go/commit/?id=2ef39d47540c4537f0ddd3355fb95b33b91c09b7

zx2c4 avatar Sep 05 '21 14:09 zx2c4

I haven't forgotten about this. @gco reached out to me to let me know that with a bit of copying functions around and s/illumos/solaris it was possible to get this code working on Solaris (My corresponding changes are here: https://github.com/nshalman/wireguard-go/compare/illumos-to-upstream...nshalman:wireguard-go:solaris). So https://go-review.googlesource.com/c/sys/+/437357 is open. When that is done, I hope to do some more cleanup of my wg-quick bits and get that all tested and squared away, but this has seen a lot of use by me and others, so hopefully I'll have it ready to land soon(ish).

nshalman avatar Oct 09 '22 02:10 nshalman

Yalright, well, just let me know when it's finally done with no TODOs left, and I'm happy to review it.

zx2c4 avatar Oct 09 '22 02:10 zx2c4

Yalright, well, just let me know when it's finally done with no TODOs left, and I'm happy to review it.

Looking forward to your review when you have the time. Thank you!!

nshalman avatar Oct 27 '22 14:10 nshalman

Noting for myself... This PR conflicts with #64. I tagged a version that I used with Tailscale 1.36 here: https://github.com/nshalman/wireguard-go/releases/tag/tailscale-1.36.0 / https://github.com/nshalman/wireguard-go/commit/0d59a42bb75d113d1ab95fbbbf7dcf66affb4c01

Not sure if @zx2c4 has a preference on which one lands first... It also looks like this needs a merge conflict cleanup from me on go.mod and go.sum

nshalman avatar Feb 27 '23 16:02 nshalman

I wasn't aware this was a serious effort, such that I should have sequencing concerns like that. It's been over two years now, so this is mostly your playground. But if you do think that the commit is done and mergable, please do pipe up so I can review and merge.

zx2c4 avatar Mar 03 '23 13:03 zx2c4

I wasn't aware this was a serious effort, such that I should have sequencing concerns like that. It's been over two years now, so this is mostly your playground.

For someone smarter than me and with more free time I sure this could all have been done faster.

But if you do think that the commit is done and mergable, please do pipe up so I can review and merge.

From a functionality perspective, all the testing that I've had the time and energy to do has shown this code to be complete. I do have concerns about the related https://github.com/WireGuard/wireguard-tools/pull/17 having some subtle bug that I haven't found yet, but it too has stood up to my testing.

I did request a review, but I probably wasn't explicit enough, so that's on me (https://github.com/WireGuard/wireguard-go/pull/39#issuecomment-1293646049):

Looking forward to your review when you have the time. Thank you!!

I will clean up the merge conflict and ping you very explicitly once that's done. I'll also mark the wireguard-tools PR ready for review and ping you on that one as well.

So. The next action is still on me and I will be very explicit in requesting review once I'm done with those steps.

Thank you for your attention!

nshalman avatar Mar 03 '23 14:03 nshalman

@zx2c4 please review.

nshalman avatar Mar 06 '23 14:03 nshalman

@jwhited your feedback would also be welcome.

nshalman avatar Mar 06 '23 14:03 nshalman

Rebased to clear merge conflicts. Still ready for review by @zx2c4.

nshalman avatar Mar 13 '23 14:03 nshalman

From your top message:

reliability testing via Tailscale (Tailscale for both Solaris and illumos can be built on top of this work and it has been quite reliable.)

Testing with third-party software is not sufficient. Test wireguard-go itself and then report back. Please write what you've tested and how.

zx2c4 avatar Mar 13 '23 17:03 zx2c4