docker-machine-driver-hetzner

Support the creation of servers with only private networks

Open eedugon opened this issue 1 year ago • 9 comments

Recently Hetzner has added the support of servers belonging only to private networks, without any public interface.

This is a great achievement for security and architectural purposes, and it would be great if this driver for Rancher supported the creation of the servers in this new way.

In order to create a server without public IPv4 and v6 Hetzner has added 2 new flags documented here: https://docs.hetzner.cloud/#servers-create-a-server

public_net object

With the CLI (hcloud) we only need to use the parameters --without-ipv6 --without-ipv4 when creating the server, and with the go library used in this project I assume we should just add the public_net new object with the enable_ipv4 and enable_ipv6 booleans set to false when creating the server.

Of course this should be used together with the existing option of the driver to "use private networks".
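As a rough illustration of the request described above, here is a minimal Go sketch that builds the JSON body for the "create a server" call with both public interfaces disabled. It uses only the standard library rather than the Go library the driver depends on. The struct and function names (`createServerRequest`, `privateOnlyRequest`) are made up for this example, and the server type, image, and network ID are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// publicNet mirrors the public_net object of the create-server request.
type publicNet struct {
	EnableIPv4 bool `json:"enable_ipv4"`
	EnableIPv6 bool `json:"enable_ipv6"`
}

// createServerRequest is a hypothetical, trimmed-down view of the request
// body; the real API accepts many more fields.
type createServerRequest struct {
	Name       string    `json:"name"`
	ServerType string    `json:"server_type"`
	Image      string    `json:"image"`
	Networks   []int64   `json:"networks"`
	PublicNet  publicNet `json:"public_net"`
}

// privateOnlyRequest builds a request body for a server that is attached to
// one private network and has neither a public IPv4 nor a public IPv6.
func privateOnlyRequest(name string, networkID int64) ([]byte, error) {
	req := createServerRequest{
		Name:       name,
		ServerType: "cx11",         // placeholder server type
		Image:      "ubuntu-22.04", // placeholder image
		Networks:   []int64{networkID},
		PublicNet:  publicNet{EnableIPv4: false, EnableIPv6: false},
	}
	return json.Marshal(req)
}

func main() {
	body, err := privateOnlyRequest("private-node-1", 12345)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The private network has to be listed in the same request, since a server with no public interface and no private network would be unreachable.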

Hope you find this proposal interesting.

eedugon avatar Jul 10 '22 23:07 eedugon

Should work with 3.8.0 now, could you try it please?

JonasProgrammer avatar Jul 15 '22 17:07 JonasProgrammer

Thanks a lot for the quick PR! Looking forward to trying this out.

As a Rancher user I use this driver together with the UI/JS package that takes care of the calls (https://github.com/mxschmitt/ui-driver-hetzner), I haven't used the driver directly myself.

Should I modify the other code (it belongs to a different GH repo) in order to try this out?

Or what's your suggestion to try this? The code and changes look good to me and I believe it will do exactly what I was suggesting.

eedugon avatar Jul 15 '22 19:07 eedugon

@JonasProgrammer : I have tested the 3.8.0 driver together with a customization made on https://github.com/mxschmitt/ui-driver-hetzner and it works fine. I'm able to create servers with only private networks. Thanks a lot, I think we can close this issue!

eedugon avatar Jul 20 '22 06:07 eedugon

@eedugon did you commit your changes to ui-driver-hetzner to a fork / PR somewhere ?

hoerup avatar Jul 20 '22 08:07 hoerup

@hoerup : nope, I couldn't open any PR because I was unable to build the components.js properly from the source. I believe that repository (ui-driver-hetzner) is not really aligned with the public delivery of component.js for Rancher 2.x, so it looked impossible to me to create a proper and valid component.js similar to the published one from that repo, hence I couldn't contribute at all :(

Maybe it's because I'm missing something silly as I'm not a developer but a sysadmin.

I created 2 issues there to make note of that:

  • https://github.com/mxschmitt/ui-driver-hetzner/issues/130
  • https://github.com/mxschmitt/ui-driver-hetzner/issues/129

What I did was an ugly hack towards the published components.js to make use of the new option of the driver.

I explain that in a comment here.

If you want to try it out feel free to take my changes from https://gist.github.com/eedugon/66b8f7fce3d059faefe790bc5a7190be. Remember that you will have to host that file together with component.css and hetzner.svg in a web server.

eedugon avatar Jul 20 '22 09:07 eedugon

Just wanted to note that servers without public networks (IPv4/6) cannot talk to the outside world. Maybe it makes sense to state this in the docs. Cheers.

martyrs avatar Aug 25 '22 15:08 martyrs

@martyrs : in order to have external connectivity while keeping only the internal interface, you would need to deploy your own NAT gateway / firewall in the network (until Hetzner offers that as a service, and I don't know if it's in their plans).

I hope to publish soon a how-to guide to accomplish this setup, because it's not difficult and works pretty well.

eedugon avatar Aug 27 '22 09:08 eedugon

@eedugon yeah should be pretty straightforward using wireguard. (wireguard is available on hetzner cloud images)

martyrs avatar Aug 27 '22 19:08 martyrs

We have been using this feature quite intensively for the past couple of days. Unfortunately, it does not work reliably. This is not docker-machine-driver-hetzner's fault, but Hetzner's.

I have seen 3 cases:

  • A machine hangs in the "spinning power button" phase (see the power button). It stops spinning at some point, and:
    • sometimes when I manually start it it works.
    • sometimes when I manually start it, it fails with a server.start error event.
  • A machine is created but the attach to network step fails with a server.attach_to_network error event.

The first case is the one I see most often. In that case docker-machine-driver-hetzner hangs completely, waiting for the machine to reach a ready state. If my manual start succeeds, docker-machine-driver-hetzner continues. I have seen 5-6 machines in this errored offline state at the same time.

Hetzner is aware of this problem and not likely to fix it anytime soon. Although they don't state anywhere that the Private Networking feature is in beta/alpha or unstable, they do not seem to prioritize a fix (even though the feature is seemingly in production). I quote:

we don't have any solution for that currently. The only workaround is as mentioned to create the servers without private network and attach them later to it.

We know that this is a feature we offer and it should work as expected. We are working on it, however we can't give you any ETA.

Their solution is to start (or only create, not sure) an Instance w/o Private Network and attach it later. Should we perhaps:

  • Add a timeout for machine creation. Then delete and retry?
  • Add an option to first create the machine w/o autostart, attach the network and then start it?

Wouter0100 avatar Sep 30 '22 10:09 Wouter0100

I can see how this is causing headaches, so I want to try to at least somewhat mitigate this. On the other hand, I would like to keep the impact limited, as I am not really keen to introduce too much complexity (read: things I can do wrong) to work around something (hopefully) temporary.

Add a timeout for machine creation. Then delete and retry?

Not really a fan of this, as it would cause headaches of all sorts. This would introduce a whole lot of complexity pretty much everywhere, and the scope is not really clear: Do we do this for servers only? If so, why? If not, should every single API call be retried/where to draw the line? Driver aside, this could be implemented on a higher level, however. Just have some watchdog process run alongside and then send a signal to docker-machine, if still running. Granted, the cleanup could be a bit messy due to possible race conditions but that is the case even now, if you interrupt the creation in the wrong phase.
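For reference, the timeout idea under discussion boils down to a deadline-bound polling loop. The Go sketch below is only an illustration of that pattern and is not code from the driver: `waitReady`, `errNotReady`, and the two demo helpers are hypothetical names, and the `check` callback stands in for whatever status query the caller uses. What to do after the deadline (delete, retry, or signal docker-machine from a watchdog) is deliberately left to the caller.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errNotReady is returned when the deadline passes before the server is ready.
var errNotReady = errors.New("server did not become ready before the deadline")

// waitReady calls check immediately and then once per interval until it
// reports true, returns an error, or ctx is cancelled or times out.
func waitReady(ctx context.Context, interval time.Duration, check func() (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ready, err := check()
		if err != nil {
			return err
		}
		if ready {
			return nil
		}
		select {
		case <-ctx.Done():
			return errNotReady
		case <-ticker.C:
		}
	}
}

// demoStuck simulates a server that never leaves the "starting" phase.
func demoStuck() error {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return waitReady(ctx, 10*time.Millisecond, func() (bool, error) { return false, nil })
}

// demoHealthy simulates a server that becomes ready on the third poll.
func demoHealthy() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	polls := 0
	return waitReady(ctx, 5*time.Millisecond, func() (bool, error) {
		polls++
		return polls >= 3, nil
	})
}

func main() {
	fmt.Println("stuck server:", demoStuck())
	fmt.Println("healthy server:", demoHealthy())
}
```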

Add an option to first create the machine w/o autostart, attach the network and then start it?

That I can get behind. It pretty much affects only the creation stage and should be fairly straight-forward (remove from CreateOpts, add additional call later). The question is, is the problem really just related to the attachment on creation? Or is it creation with rapid attachment following it? If the latter is the case, then we will need a delay in between. Perhaps that in itself should be an option?
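The sequence discussed here (create without autostart, optionally wait, attach the network, then start) could look roughly like the Go sketch below. It is written against a hypothetical `serverAPI` interface rather than the real client library, so none of these method names should be read as the library's API. In terms of the Hetzner API, the steps would presumably correspond to creating with `start_after_create` set to false, followed by the attach-to-network and power-on server actions. The `recorder` type exists only to demonstrate the call order.

```go
package main

import (
	"fmt"
	"time"
)

// serverAPI is a hypothetical abstraction over the three calls involved.
type serverAPI interface {
	CreateStopped(name string) (int64, error)
	AttachToNetwork(serverID, networkID int64) error
	Start(serverID int64) error
}

// createAttachStart creates the server powered off, waits for delay,
// attaches it to the private network, and only then starts it.
func createAttachStart(api serverAPI, name string, networkID int64, delay time.Duration) (int64, error) {
	id, err := api.CreateStopped(name)
	if err != nil {
		return 0, fmt.Errorf("create: %w", err)
	}
	// Optional pause, in case the problem is rapid attachment after creation.
	time.Sleep(delay)
	if err := api.AttachToNetwork(id, networkID); err != nil {
		return id, fmt.Errorf("attach to network: %w", err)
	}
	if err := api.Start(id); err != nil {
		return id, fmt.Errorf("start: %w", err)
	}
	return id, nil
}

// recorder is a fake serverAPI that only records the order of calls.
type recorder struct{ calls []string }

func (r *recorder) CreateStopped(name string) (int64, error) {
	r.calls = append(r.calls, "create")
	return 1, nil
}

func (r *recorder) AttachToNetwork(serverID, networkID int64) error {
	r.calls = append(r.calls, "attach")
	return nil
}

func (r *recorder) Start(serverID int64) error {
	r.calls = append(r.calls, "start")
	return nil
}

// demoCallOrder runs the sequence against the fake and returns the calls made.
func demoCallOrder() []string {
	r := &recorder{}
	if _, err := createAttachStart(r, "node-1", 42, 0); err != nil {
		panic(err)
	}
	return r.calls
}

func main() {
	fmt.Println(demoCallOrder())
}
```

Making the delay a parameter keeps both hypotheses testable: a zero delay covers "attachment on creation is the problem", while a non-zero delay covers "rapid attachment after creation is the problem".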

JonasProgrammer avatar Oct 07 '22 22:10 JonasProgrammer

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jan 07 '23 20:01 stale[bot]

@JonasProgrammer I guess this could be closed by now? No feedback in 2 months, and does the original problem even exist any more?

perlun avatar Mar 07 '23 13:03 perlun

I'm not entirely sure everything is fixed, given the fact that we only recently introduced a flag to wait on server creation so outside orchestration does not run in loops, but I guess the flags are there and the lack of response is an acceptable argument.

JonasProgrammer avatar Mar 08 '23 20:03 JonasProgrammer