[Epic] Provide an AMI image of the VM used by CRC
Assigned: @gbraad, @praveenkumar
- [ ] disk layout (can conversion work, or is an install from the current base image necessary?)
- [ ] handle cluster name change
- Note: internal routes are static
- [ ] cloud-init settings provisioning
[Spike] [Stretch goal] [important]
As @bbrowning also pointed out, a straight conversion is not possible because AWS expects a simpler disk layout than the partitioned one used by the current libvirt-based images.
Some of the challenges we will face on the AMI side, as pointed out by @bbrowning:
- The partitioning layout (among other things) used by RHCOS isn't supported for import by AWS. Both the EFI bootloader partition and the GPT partitioning scheme (https://coreos.com/os/docs/latest/sdk-disk-partitions.html, inspired by the x86 EFI layout at http://www.chromium.org/chromium-os/chromiumos-design-docs/disk-format) are integral to how CoreOS works.
- If the single node runs directly on AWS, there are several places where OCP expects to keep controlling resources in the AWS account, and these will be hard to decouple in order to create a generic AMI.
- A viable path forward is a UPI install on AWS of a single-node cluster where no cloud credentials are given to any pods on the bootstrap or master nodes. Once that single-node cluster is up without any AWS-specific integrations enabled inside of it, it should be possible to save it off as a reusable AMI, in a spirit similar to what the createdisk.sh script in code-ready/snc does for libvirt.
- Something will still need to do the equivalent of `crc start` to configure everything inside the AMI when a user starts an instance of it. And, because these are not running on a local laptop, additional steps will be needed to rotate the SSH keys, the kubeadmin password, and all certificates / certificate authorities (or at least block kubeadmin client-cert auth) so that the clusters spun up are not easily "rooted".
- Never expose a CRC instance to the internet at large without a reverse proxy sitting in front of its API server, because all CRC installs share the same set of certificate authorities, which means any API-server client certificate valid for one is valid for all. You therefore have to not use TLS client certs and explicitly disallow them, which means putting a TLS-terminating reverse proxy in front of the API server.
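For context, the "save it off as a reusable AMI" step above would go through EC2's snapshot-import flow. A minimal sketch with the `aws` CLI follows; the bucket name, object key, AMI name, and snapshot ID are all placeholders, and the polling step is elided:

```shell
#!/bin/sh
# Sketch of importing a disk image as an AMI (all names hypothetical).
BUCKET="my-crc-images"   # assumed S3 bucket already holding the VMDK
KEY="rhcos-crc.vmdk"     # assumed object key

# Only attempt the import when the CLI and credentials are actually configured.
if command -v aws >/dev/null 2>&1 && [ -n "${AWS_ACCESS_KEY_ID:-}" ]; then
  TASK_ID=$(aws ec2 import-snapshot \
    --disk-container "Format=VMDK,UserBucket={S3Bucket=$BUCKET,S3Key=$KEY}" \
    --query ImportTaskId --output text)
  # ...poll describe-import-snapshot-tasks for $TASK_ID, then register an AMI
  # from the resulting snapshot, e.g.:
  # aws ec2 register-image --name crc-rhcos \
  #   --architecture x86_64 --virtualization-type hvm --ena-support \
  #   --root-device-name /dev/xvda \
  #   --block-device-mappings "DeviceName=/dev/xvda,Ebs={SnapshotId=<snapshot-id>}"
else
  echo "aws CLI or credentials missing; would import s3://$BUCKET/$KEY"
fi
```

Note that `import-snapshot` is exactly where the GPT/EFI layout constraints bite: the import service has to understand the disk image it is handed.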
I think that nicely sums up all the challenges I'm aware of, thanks @praveenkumar. One thing I have not looked into is how the existing RHCOS AMIs get uploaded to begin with. How are they building the AMI to get around the EC2 limitations on importing VM images?
@ashcrow Can you perhaps explain how the AMI is generated to deal with the EC2 limitations with regard to MBR/GPT?
I think there may be some misunderstanding. The link referenced is for Container Linux which is not FCOS or RHCOS.
For RHCOS, the AMI is generated from the qemu image and uploaded via cosa's internal `ore`. Before uploading, we convert the qcow2 into a VMDK with specific options.
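That conversion step can be sketched as follows. The exact option string is an assumption (typical settings for a streamOptimized VMDK, which is what EC2's import path expects), not a quote from cosa, and the file names are placeholders:

```shell
#!/bin/sh
# Sketch of the qcow2 -> VMDK conversion done before upload.
SRC="rhcos.qcow2"   # hypothetical input image
DST="rhcos.vmdk"
# Assumed options: streamOptimized subformat is required by EC2's importer.
VMDK_OPTS="adapter_type=lsilogic,subformat=streamOptimized,compat6"

if command -v qemu-img >/dev/null 2>&1 && [ -f "$SRC" ]; then
  qemu-img convert -f qcow2 -O vmdk -o "$VMDK_OPTS" "$SRC" "$DST"
else
  echo "would run: qemu-img convert -f qcow2 -O vmdk -o $VMDK_OPTS $SRC $DST"
fi
```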
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@guillaumerose these comments detail the needed changes.
@anjannath Please have a look at some of those issues.
I think this is now in the scope of the crc-cloud project, so I am closing it here.