terraform-provider-rancher2
[DNM until after 2.8 release] Migrate from sdk-v1 to sdk-v2
DNM: waiting for release of 4.0.0
Issue: #1080
Observation: I had a lot of conflicts on PR #1207. IMO it was safer to close that one and create a new one, cherry-picking my changes, than to resolve the conflicts on that PR.
Problem
Migrate SDK1 to SDK2
Solution
Migrate SDK1 to SDK2: Main changes:
- Change the `Create`, `Read`, `Update`, `Delete` fields to `CreateContext`, `ReadContext`, etc.
- Change function signatures from `dataSourceRancher2....(d *schema.ResourceData, meta interface{}) error` to `dataSourceRancher2...(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics`.
- Change some error handling to work with `diag.Diagnostics`.
- Remove `MaxItems` from TF arguments with `Computed: true`. Required by the SDK v1 to v2 migration.
- Add some arguments that were being set without being in the schema. Required by the SDK v1 to v2 migration.
- Change some types and add others so they correspond to the schemas. Required by the SDK v1 to v2 migration.
Observation: I had a lot of errors in the tests, mainly 401s during the delete step of TestAccRancher2Upgrade. I did 2 commits to add some debug logic to that test, and at that point it just worked. I'm leaving the debug prints in this PR so that if I hit that error again I can confirm whether the envs were being overwritten unnecessarily (I believe this was causing the error).
Testing
Engineering Testing
Manual Testing
I tested with TF, trying to create, destroy, scale, and change objects.
DO-RKE2-Basic.txt AWS-RKE2-CUSTOM.txt aws-rke1-custom.txt
Automated Testing
I ran all the automated tests; I also had to change some of them to match the SDK upgrade.
QA Testing Considerations
This is a huge change. I wasn't able to find a bug caused by the migration, but please test it intensively.
Regression Considerations
@felipe-colussi In general, nice job! I had some comments following up on our discussion of the source code and schema updates last week. Please do the following:
- Remove all comments/TODO comments left in the source code
- Clean up the error messages I referenced
- Post more details in Manual Testing (types of cluster tested, a test upgrading tf from the old version to the version with the SDK v2 updates, tf config files)
- Post test results for provisioning an RKE and a v2 cluster with `cluster_agent_deployment_customization` and `fleet_agent_deployment_customization` defined. Verify it works and shows the correct values in the tf state and the cluster mgmt yaml.
@felipe-colussi Can you also please squash the top 8-10 commits https://github.com/rancher/terraform-provider-rancher2/pull/1224/commits? They are all pretty small updates.
Looks like the build is failing due to go fmt. Run `gofmt -w` on these files:
./rancher2/structure_cluster_v2_test.go
./rancher2/resource_rancher2_global_role.go
The changes got massively confusing on this PR, so we did the following:
- `git rebase master` and resolved the merge conflicts
- verified that the changes from the upgrade to SDK v2 were still the same, checked that the build and tests are working, and squashed 30 commits into 1 with a readable message.
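The squash step can be reproduced non-interactively. A throwaway demo (temp repo, fake commits; `git reset --soft` is an equivalent of the interactive-rebase squash actually used, and the commit count and messages are illustrative):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev
echo base > f && git add f && git commit -qm "base"
# three small commits standing in for the many small review-fix commits
for i in 1 2 3; do echo "$i" >> f && git commit -qam "small update $i"; done
git reset --soft HEAD~3                      # move HEAD back, keep all changes staged
git commit -qm "migrate provider to SDK v2"  # one squashed commit, readable message
git rev-list --count HEAD                    # 2 commits remain: base + squashed
```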
@felipe-colussi If there are additional review comments, please include them in only 1-2 additional commits that are for addressed comments.
Issues with the auto-generated /vendor dir turned out to be unique to my local environment. Unused dependency fix will be put into a separate PR if needed.
Smoke tests after migration: RKE2 - AWS = Create Auth -> Cluster creation -> scale down -> scale up -> Delete. OK.
RKE2 - DO = Create Auth -> Create Cluster -> Upgrade K8S version -> Delete. Ok.
RKE - DO = "Data" Auth -> Create Cluster with cluster_agent_deployment_customization -> Remove override_affinity+override_resource_requirements+fleet_agent_deployment_customization -> Increase node count -> Delete. Ok rke-do-customization.txt
RKE - AWS = Create Auth -> Create Cluster (k8s version and network plugin only) -> increase node count -> Delete. OK.