terraform-provider-azurerm
terraform-provider-azurerm copied to clipboard
New resource: `azurerm_redhatopenshift_cluster`
azurerm_redhatopenshift_cluster
This PR introduces an Azure Red Hat OpenShift (ARO) provider targeting the issue #3614.
Known limitations
~~Currently the worker profile is immutable on the ARO API, more details here. In practice, it means that:~~
~~* Any change on the profile (e.g. nodes count, disk size) will require a cluster recreation.~~
- Using
oc
to scale theMachineSet
may cause problems as the cluster state may not be represented in Terraform state.
Update: Immutable worker profile was associated with the API
2020-04-30
version. Not applicable anymore for the2022-04-01
version.
@fedeoliv @tombuildsstuff Just wonder what is the status of this PR. seems this is way off the main branch. Just wondering what was the reason this was not merged initially? I am interested in this feature as well as contributing.
@datianshi - i believe it was because the worker profile is immutable so after terraform creates the resource there would be no good/easy way to handle any updates made to the worker profile outside of terraform.
@katbyte Thanks for the reply. It makes sense. But I still think it has value just for provisioning. It is the nature of the resource. Up to the end-user to know the risks of changing/updating. End-user can use lifecycle management to prevent destroy as well.
@katbyte does that mean this is not going to be added until RH changes the way RHO is deployed?
I agree with you @datianshi, I believe it's valuable providing the option of deploying ARO through Terraform even with this current limitation.
Well, apparently the new RHO version on the Azure SDK supports mutable worker profile attributes, see the 2022-04-01
version here. The previous 2020-04-30
version had immutable properties (details for comparison here).
I'll update the resource to use this new version to validate if that works and make the required adjustments.
FYI @fedeoliv I created an independent provider here https://registry.terraform.io/providers/rh-mobb/azureopenshift/0.0.2 This is with 2022-04-01 schema. Borrowed a lot of your code + the azurerm authentication. This is in my own repo. I think it should be part of the azurerm resource. Would sunset ours once this is merged and published.
Awesome @datianshi, thanks for sharing this one! I'm working on updating the schema, CRUD methods and integration tests to match this new version. I'll keep you posted 😊.
Hey @tombuildsstuff @katbyte @datianshi the PR is ready to be reviewed again 😊. I've updated the code and documentation to match the 2022-04-01
schema and introduced new acceptance tests for nodes encryption at host and FIPS cryptographic modules.
what stops this branch from being approved and merged?
what stops this branch from being approved and merged?
@tombuildsstuff wondering if you have any update on this review?
@tombuildsstuff could you merge it?
not sure what client id it is referring to?
@katbyte looking at the error message apparently we can't associate a service principal with multiple clusters (1-1 relationship). The options I have in mind is running the acceptance tests sequentially, or somehow generate new service principals for each test. Any suggestion?
@fedeoliv - we can run them sequentially, also if that is a service limitation we should document it in the docs
@fedeoliv - we can run them sequentially, also if that is a service limitation we should document it in the docs
Great, I updated the ARO documentation with a note about this limitation.
@fedeoliv - we still need to change the tests to run sequentially
@fedeoliv apologies for the contradiction - we can we update this to use a new Service Principal created as a part of each test - we can use the AzureAD provider to do that - see this test as an example - which'd mean we can run these in parallel and avoid this limitation.
@fedeoliv apologies for the contradiction - we can we update this to use a new Service Principal created as a part of each test - we can use the AzureAD provider to do that - see this test as an example - which'd mean we can run these in parallel and avoid this limitation.
That's great, thanks for pointing an example @tombuildsstuff! I'll update the tests with your recommendation!
Hi @fedeoliv Thanks for your work on this! I've also have been behind the scenes prodding the right folks to review this PR since we also use OpenShift for Consul K8s testing. Looking forward to see this get merged!
Any ETA on this? @fedeoliv Amazing work this has been a long time coming.
Hey @fedeoliv - seems like the tests are failing due to only one cluster being allowed. is this a service limit that can be increased or?
Test Failed
------- Stdout: -------
=== RUN TestAccOpenShiftCluster_private
=== PAUSE TestAccOpenShiftCluster_private
=== CONT TestAccOpenShiftCluster_private
=== CONT TestAccOpenShiftCluster_private
testcase.go:110: Step 1/1 error: Error running apply: exit status 1
Error: creating Red Hat OpenShift Cluster "acctestaro220929200149494479" (Resource Group "acctestRG-aro-220929200149494479"): redhatopenshift.OpenShiftClustersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="DuplicateResourceGroup" Message="The provided resource group '/subscriptions/*******/resourcegroups/aro-cyufr257' already contains a cluster."
with azurerm_redhatopenshift_cluster.test,
on terraform_plugin_test.tf line 52, in resource "azurerm_redhatopenshift_cluster" "test":
52: resource "azurerm_redhatopenshift_cluster" "test" {
--- FAIL: TestAccOpenShiftCluster_private (185.03s)
also we'll need to fix up the failing GHA CI
s like the tests are failing due to only one cluster being allowed. is this a service limit that can be increased or
Hey @katbyte, humm seems this is a limitation. How did you get this error? Did it happen due to parallel tests execution? If yes, that's weird...I thought the random string value associated with a resource group name (e.g. accTestRG-aro-%d) wouldn't be a problem in terms of potential name conflicts.
@fedeoliv - i'm not sure, it seems like all the tests are using acctestRG-aro-%d"
so no idea where aro-cyufr257
is coming from? unless the service/API is doing something funky behind the scenes and not actually using the RG we specify?
@fedeoliv Any idea if we're able to resolve the test failures so we can get this PR merged?
@fedeoliv From chatting internally with some HashiCorp folks, it looks like a few clean up tasks are needed
go mod tidy go mod vendor should fix it up - the real issue here is the failing tests due to that random resource group conflict
I am still seeing three distinct test failures for this PR:
It seems like for some of the tests the SP needs additional permissions? could we update the tests and document that?
Hi @fedeoliv do you know more help here to get this one through?
@fedeoliv Pinging again, looks like some tests are still failing, did you need help understanding why these failures are occurring? Does OpenShift require a dedicated RG for testing on Azure? We currently use a shared RG here for tests based on chatting with the team.
@david-yu - is there a resource group behind the scenes used by the service? because we seem to be passing in different one for each test and not defining "aro-xxxxx" anywhere in code?
You need to define a resource group ahead of time before you provision an Azure RedHat Openshift cluster as per these docs: https://learn.microsoft.com/en-us/azure/openshift/tutorial-create-cluster. I typically create one myself manually using the CLI.