
Document how to run Tinkerbell in production

Open rgl opened this issue 5 years ago • 2 comments

Can you please provide guidance on how to securely run a Tinkerbell installation?

For example:

  • Should all machines have a dedicated NIC for DHCP? VLAN?
  • Should all machines be configured to use Secure Boot?
    • Well, I don't see Secure Boot described anywhere, so I guess this is not yet supported.
    • Should all machines be configured to secure boot from an HTTPS .ipxe endpoint?
  • Should a TPM be required? Will this support remote attestation of some kind?
  • Should we configure the underlying networking infrastructure to block DHCP packets from non-DHCP servers?
  • setup.sh seems nice for development purposes, but for production, we should probably use k8s? Other orchestrator?
  • What about HA/DR?
  • Etc.

rgl, May 26 '20 07:05

I think it would make total sense for there to be a "Best Production Practices" document, but to be honest, so much of it is dependent on the environment and what kind of threat vectors you are worried about -- and less so Tinkerbell specific.

Here is my wild attempt at answering some of these questions:

  • A dedicated NIC for DHCP is unlikely to be helpful
  • A dedicated VLAN depends on whether or not you trust your network
  • Secure Boot can be problematic for many operating systems.
  • Tinkerbell doesn't know about TPMs -- it's out of scope (but very useful for production)
  • It isn't clear what benefit there would be to filtering out DHCP traffic in particular. If you have nodes that shouldn't talk to the DHCP server, you could be protective and firewall them out from that network entirely -- but what should happen when you want to reinstall it?
  • Kubernetes is a great way to go.
  • HA/DR is out of scope for a security doc, but Kubernetes does make it easy.

You may find this worth reading:

https://software.intel.com/content/www/us/en/develop/blogs/network-boot-in-a-zero-trust-environment.html

Since no activity has happened on this doc, I'm going to try to reword it to see if we get more action on it later.

tstromberg, Jul 27 '21 15:07

That Intel document has a nice summary: use UEFI HTTPS (with mutual authentication) and Secure Boot. I hope Tinkerbell can aid in deploying this somehow :-)

"It isn't clear what benefit there would be to filtering out DHCP traffic in particular. If you have nodes that shouldn't talk to the DHCP server, you could be protective and firewall them out from that network entirely -- but what should happen when you want to reinstall it?"

I meant to say that we should perhaps prevent machines that are not DHCP servers from replying to DHCP requests (at the network equipment level, I guess). The clients would still be able to send requests, but not replies.
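To make the idea concrete, here is a minimal sketch of the filtering rule in Python. It is only an illustration of the decision logic; on real network equipment this kind of policy is usually a switch feature (commonly called DHCP snooping), not something you would run in Python. The function name `allow_dhcp_message`, the port names, and the notion of a `trusted_ports` set are assumptions made up for this example and are not part of Tinkerbell. The message type numbers are the standard DHCP option 53 values.

```python
from enum import IntEnum


class DhcpMessageType(IntEnum):
    """DHCP message types as carried in option 53."""
    DISCOVER = 1
    OFFER = 2
    REQUEST = 3
    DECLINE = 4
    ACK = 5
    NAK = 6
    RELEASE = 7
    INFORM = 8


# Messages that only a DHCP server should ever send.
SERVER_MESSAGES = frozenset({
    DhcpMessageType.OFFER,
    DhcpMessageType.ACK,
    DhcpMessageType.NAK,
})


def allow_dhcp_message(ingress_port, msg_type, trusted_ports):
    """Decide whether a DHCP message arriving on ingress_port may be forwarded.

    Hypothetical policy: server replies are forwarded only when they arrive
    on a port known to face the real DHCP server; client messages are
    forwarded from any port.
    """
    if msg_type in SERVER_MESSAGES:
        return ingress_port in trusted_ports
    return True


if __name__ == "__main__":
    trusted = frozenset({"uplink-to-dhcp-server"})  # assumed port name
    # A client asking for an address is fine from any port.
    print(allow_dhcp_message("port-7", DhcpMessageType.DISCOVER, trusted))
    # A machine on an ordinary port trying to hand out addresses is dropped.
    print(allow_dhcp_message("port-7", DhcpMessageType.OFFER, trusted))
    # The real server's replies pass.
    print(allow_dhcp_message("uplink-to-dhcp-server", DhcpMessageType.OFFER, trusted))
```

Note that this only stops rogue servers; it does nothing about one client pretending to be another.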

Somehow the system should also prevent clients from impersonating other clients I guess.

rgl, Jul 27 '21 17:07

There's a holistic documentation effort being tracked by https://github.com/tinkerbell/roadmap/issues/5. Please refer to that issue for more information.

chrisdoherty4, Dec 27 '22 03:12