terraform-provider-ad
Provider produced inconsistent result [on Linux] after apply
Terraform Version and Provider Version
Terraform v0.14.10
- provider registry.terraform.io/hashicorp/ad v0.4.2
Windows Version
Linux Version
RHEL 7.9
Affected Resource(s)
ad_provider
Expected Behavior
I have a main.tf file that uses only the ad provider. It works perfectly on Windows, but I need to run it on Linux (which is reported to be supported).
Actual Behavior
It works on windows but not on Linux. On Linux it ends with the following output:
[... list of objects to create in Active Directory ...]
Plan: 23 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
ad_ou.SDDC: Creating...
ad_ou.SDDC: Creation complete after 7s [id=8807dd76-5262-474b-9875-179ca77213d0]
ad_ou.PROVIDER: Creating...
ad_ou.BUSINESS: Creating...
ad_ou.SUBSCRIBER: Creating...
[... many others ...]
ad_user.sddc-fwal1a1ab2bp967: Creation complete after 5s [id=4d67549d-77b5-43cf-99bd-bde35553375b]
Error: Provider produced inconsistent result after apply
When applying changes to ad_ou.SUBSCRIBER, provider "registry.terraform.io/hashicorp/ad" produced an unexpected new value: Root resource was present, but now absent.
This is a bug in the provider, which should be reported in the provider's own issue tracker.
Steps to Reproduce
I can provide the main.tf if needed.
Community Note
First of all, I need to know whether this provider is really supported on Linux or whether I have to run it from a Windows machine. :-)
Same here :( It happens to me with ad_computer resource:
terraform apply
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
+ create
Terraform will perform the following actions:
# ad_computer.test will be created
+ resource "ad_computer" "test" {
+ container = "OU=foo,OU=,DC=bar,DC=com"
+ dn = (known after apply)
+ guid = (known after apply)
+ id = (known after apply)
+ name = "test"
+ pre2kname = (known after apply)
+ sid = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
Changes to Outputs:
- data = "CN=test,OU=foo,OU=,DC=bar,DC=com" -> null
+ resource = (known after apply)
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
ad_computer.test: Creating...
╷
│ Error: Provider produced inconsistent result after apply
│
│ When applying changes to ad_computer.test, provider "provider[\"registry.terraform.io/hashicorp/ad\"]" produced
│ an unexpected new value: Root resource was present, but now absent.
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
Hello! Our goal is for the provider to work on all platforms Terraform is expected to run on. If something doesn't work on Linux, we'll figure out a way to make it work.
I'd appreciate it if you could share your debug logs (run terraform apply with the environment variable TF_LOG set to DEBUG).
Hi, and thanks a lot for your interest in this problem. I'm attaching the log captured with TF_LOG=DEBUG. I've only changed some potentially sensitive local names, which shouldn't affect the point.
I think I managed to reproduce the issue. Are you by any chance using a replicated setup for your domain controllers?
Yes, the domain runs in a multi-domain-controller setup. I don't know much about the configuration since I only use the controllers as a "client", but I can ask our MS Windows sysadmin for more information if you need it.
Thanks, Claudio
No need to; I think I understand what is going on. When we run the PowerShell commands, we are probably hitting different members of the DC cluster, and if we query an object before it has replicated to all members, we see the errors you are getting.
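To make the suspected race concrete, here is a toy model in Go. It is only a sketch of the hypothesis above, not provider code: the `cluster` type, its round-robin `pick`, and the explicit `replicate` step are all made-up stand-ins for a replicated DC pool behind a single alias.

```go
package main

import "fmt"

// dc is a toy stand-in for one domain controller: a set of object names.
type dc map[string]bool

// cluster models a replicated pool reached through one round-robin alias.
// Hypothetical type for illustration only.
type cluster struct {
	members []dc
	next    int
}

// newCluster builds a pool of n empty members.
func newCluster(n int) *cluster {
	c := &cluster{}
	for i := 0; i < n; i++ {
		c.members = append(c.members, dc{})
	}
	return c
}

// pick returns the next member in round-robin order.
func (c *cluster) pick() dc {
	m := c.members[c.next%len(c.members)]
	c.next++
	return m
}

// create writes the object to whichever member answers.
func (c *cluster) create(name string) { c.pick()[name] = true }

// exists reads from whichever member answers, which may be a different one.
func (c *cluster) exists(name string) bool { return c.pick()[name] }

// replicate copies every object to every member.
func (c *cluster) replicate() {
	all := dc{}
	for _, m := range c.members {
		for k := range m {
			all[k] = true
		}
	}
	for _, m := range c.members {
		for k := range all {
			m[k] = true
		}
	}
}

func main() {
	c := newCluster(2)
	c.create("OU=SUBSCRIBER")
	// The read lands on the second member, which has not seen the object yet.
	fmt.Println("read before replication:", c.exists("OU=SUBSCRIBER")) // false
	c.replicate()
	fmt.Println("read after replication:", c.exists("OU=SUBSCRIBER")) // true
}
```

In this model the create succeeds and the immediate read reports the object as absent, which is the same shape as "Root resource was present, but now absent".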
I am looking into this further. Thanks for reporting this :)
Yes, that makes sense, although it is not clear to me why the problem does not arise when running the same main.tf from an MS Windows machine...
In any case, I look forward to hearing more from you. Thanks again for your time. :-)
Good morning. I've just run a test with winrm_hostname set to the name of just ONE of the servers behind the DomainController-pool alias, which forces the use of a single node, and it works perfectly on Linux too.
So the question is: why does it work on Windows when winrm_hostname is set to the DomainController-pool alias?
I was wondering if there has been any movement or change regarding this issue. The organization I work for is having the same problem when trying to create any new AD objects. I can also provide additional logs if needed.
Would something like below work?
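// DomainControllerComputer returns the hostname of a single domain controller
// so that later commands can be pinned to it with -Server.
// Needs the "encoding/json" and "fmt" imports.
func DomainControllerComputer(conf *config.ProviderConf) (string, error) {
	cmd := "(Get-ADDomainController | Select-Object -First 1).HostName"
	conn, err := conf.AcquireWinRMClient()
	if err != nil {
		return "", fmt.Errorf("while acquiring winrm client: %s", err)
	}
	defer conf.ReleaseWinRMClient(conn)
	psOpts := CreatePSCommandOpts{
		JSONOutput:      true,
		ForceArray:      false,
		ExecLocally:     conf.IsConnectionTypeLocal(),
		PassCredentials: conf.IsPassCredentialsEnabled(),
		Username:        conf.Settings.WinRMUsername,
		Password:        conf.Settings.WinRMPassword,
		Server:          conf.Settings.DomainName,
	}
	psCmd := NewPSCommand([]string{cmd}, psOpts)
	result, err := psCmd.Run(conf)
	if err != nil {
		return "", fmt.Errorf("winrm execution failure in DomainControllerComputer: %s", err)
	}
	if result.ExitCode != 0 {
		return "", fmt.Errorf("Get-ADDomainController exited with a non zero exit code (%d), stderr: %s", result.ExitCode, result.StdErr)
	}
	// Assumption: result.Stdout holds the command output, JSON-encoded
	// because JSONOutput is true. Not verified against the provider.
	var hostname string
	if err := json.Unmarshal([]byte(result.Stdout), &hostname); err != nil {
		return "", fmt.Errorf("while parsing Get-ADDomainController output: %s", err)
	}
	return hostname, nil
}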
You could then specify the server when you run the PowerShell commands:
Get-ADComputer -Server <DOMAIN_CONTROLLER_FROM_ABOVE>
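As a rough illustration of that last step, a small helper could append `-Server` to a cmdlet invocation. This is a sketch under my own assumptions: `pinToServer` and the host name `dc01.example.com` are hypothetical, and it only handles a single cmdlet call, not a pipeline.

```go
package main

import (
	"fmt"
	"strings"
)

// pinToServer appends -Server to a single AD cmdlet invocation unless the
// command already carries one. Hypothetical helper, not part of the provider.
func pinToServer(cmd, server string) string {
	if server == "" || strings.Contains(strings.ToLower(cmd), " -server ") {
		return cmd
	}
	// Single-quote the value and double any embedded quote, which is how
	// PowerShell escapes a quote inside a single-quoted string.
	quoted := "'" + strings.ReplaceAll(server, "'", "''") + "'"
	return fmt.Sprintf("%s -Server %s", cmd, quoted)
}

func main() {
	fmt.Println(pinToServer("Get-ADComputer -Identity test", "dc01.example.com"))
	// Get-ADComputer -Identity test -Server 'dc01.example.com'
}
```

The create and the follow-up read would both go through the same helper with the same server value, so they cannot land on different controllers.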
I believe the problem could be due to opening two separate WinRM sessions during the create: one for the create and one for the read. Each WinRM session will likely have a different logon server as a result of DNS round robin, and because replication between domain controllers in the same site can take up to a minute, the read will fail to find the object on a different DC. A solution could be to update the provider so that it shares a single session for the create and the read (assuming it currently uses two - I have not actually confirmed this).
I've been struggling with this problem too. Apparently, it's not the WinRM session that determines the DC, but rather every invocation of powershell.exe has a chance of connecting to a different controller. This behavior appears to hold regardless of whether PowerShell is launched from inside a WinRM session or a local prompt inside a Windows desktop session. It would be nice to have some official confirmation of this, but I couldn't find anything in Microsoft documentation.
In any case, a solution like the one provided by @ncecere would be enough to work around the issue. I would suggest adding -Discover and -ForceDiscover to the Get-ADDomainController command to make sure that an available and possibly different DC is returned at every invocation. Overall, this "DC stickiness" feature would be disabled by default, unless an explicit TF parameter is set. If the proposal looks good, I can offer my help and prepare a pull request with the needed changes.
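To show what I mean by opt-in stickiness, here is a minimal Go sketch. Everything in it is hypothetical: the `stickyDC` type, the `enabled` flag that a TF parameter would set, and the `discover` callback, which stands in for running Get-ADDomainController -Discover -ForceDiscover over WinRM.

```go
package main

import (
	"fmt"
	"sync"
)

// stickyDC resolves a domain controller once and reuses it afterwards.
// Hypothetical type; discover stands in for the real discovery command.
type stickyDC struct {
	enabled  bool   // false by default, so current behavior is unchanged
	fallback string // value used when stickiness is off, e.g. the domain name
	discover func() (string, error)

	once sync.Once
	host string
	err  error
}

// Server returns the value to pass to -Server for every AD command.
func (s *stickyDC) Server() (string, error) {
	if !s.enabled {
		return s.fallback, nil
	}
	// Discovery runs at most once; later calls reuse the first result.
	s.once.Do(func() { s.host, s.err = s.discover() })
	return s.host, s.err
}

func main() {
	calls := 0
	s := &stickyDC{
		enabled:  true,
		fallback: "example.com",
		discover: func() (string, error) {
			calls++
			return fmt.Sprintf("dc%02d.example.com", calls), nil
		},
	}
	a, _ := s.Server()
	b, _ := s.Server()
	fmt.Println(a, b, calls) // dc01.example.com dc01.example.com 1
}
```

With the flag off, `Server` simply returns the fallback and discovery is never run.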