terraform-provider-artifactory
Failure to apply a lot of changes in one pass
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Affected Resource(s)
All during Create/Update/Delete operations
Terraform Configuration Files
pastebin link - 390 lines, 3 variables
Debug Output
5 error(s) occurred:
* artifactory_local_repository.maven_release: 1 error(s) occurred:
* artifactory_local_repository.maven_release: PUT https://code0x58test.jfrog.io/code0x58test/api/repositories/x-libs-release-local: 400 [{Status:400 Message:Could not merge and save new descriptor [org.jfrog.common.ExecutionFailed: Last retry failed: exceeded number of retries 5. Not trying again (Should update revision 252)]
}]
* artifactory_local_repository.maven_snapshot: 1 error(s) occurred:
* artifactory_local_repository.maven_snapshot: PUT https://code0x58test.jfrog.io/code0x58test/api/repositories/x-libs-snapshot-local: 400 [{Status:400 Message:Could not merge and save new descriptor [org.jfrog.common.ExecutionFailed: Last retry failed: exceeded number of retries 5. Not trying again (Should update revision 252)]
}]
* artifactory_local_repository.rpm: 1 error(s) occurred:
* artifactory_local_repository.rpm: GET https://code0x58test.jfrog.io/code0x58test/api/repositories/x-rpm-local: 400 [{Status:400 Message:Bad Request}]
* artifactory_remote_repository.npm: 1 error(s) occurred:
* artifactory_remote_repository.npm: PUT https://code0x58test.jfrog.io/code0x58test/api/repositories/x-npm-remote: 400 [{Status:400 Message:Could not merge and save new descriptor [org.jfrog.common.ExecutionFailed: Last retry failed: exceeded number of retries 5. Not trying again (Should update revision 252)]
}]
* artifactory_virtual_repository.pypi: 1 error(s) occurred:
* artifactory_virtual_repository.pypi: GET https://code0x58test.jfrog.io/code0x58test/api/repositories/x-pypi: 400 [{Status:400 Message:Bad Request}]
Expected Behavior
I'd expect the apply to succeed in one pass, as it did when the configuration was smaller, and as it eventually does after a couple of repeated applies.
Actual Behavior
Artifactory can't keep up. It looks like there's a race to save the config, which is worked around with server-side retries, but that doesn't work when too many changes occur at once.
Steps to Reproduce
- terraform apply
Important Factoids
I suspect it's possible to set MaxConnsPerHost to 1 on the transport of the HTTP client, so that a single instance of the Terraform provider doesn't introduce the races it otherwise would.
Workarounds include:
- running terraform apply --parallelism=1, which isn't great as other non-Artifactory providers will suffer
- repeating terraform apply until the state converges (can lead to bad state)
There is a server-side config option, mentioned here, that sets the number of retries. It is not a solution, but it should be a lead for further reading if needed.
I tried a crude patch to limit MaxConnsPerHost, but it didn't fix it:
diff --git a/pkg/artifactory/provider.go b/pkg/artifactory/provider.go
index cb41084..6f10fcd 100644
--- a/pkg/artifactory/provider.go
+++ b/pkg/artifactory/provider.go
@@ -62,16 +62,21 @@ func providerConfigure(d *schema.ResourceData) (interface{}, error) {
 	password := d.Get("password").(string)
 	token := d.Get("token").(string)
 
+	t := http.DefaultTransport.(*http.Transport)
+	t.MaxConnsPerHost = 1
+
 	var client *http.Client
 	if username != "" && password != "" {
 		tp := artifactory.BasicAuthTransport{
-			Username: username,
-			Password: password,
+			Username:  username,
+			Password:  password,
+			Transport: http.DefaultTransport,
 		}
 		client = tp.Client()
 	} else if token != "" {
 		tp := &artifactory.TokenAuthTransport{
-			Token: token,
+			Token:     token,
+			Transport: http.DefaultTransport,
 		}
 		client = tp.Client()
 	} else {
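Why the patch had no effect is not established here. One possibility (an assumption, not verified against the provider) is that MaxConnsPerHost caps connections rather than requests, so several requests can still be in flight at once if they share a multiplexed connection. A stricter client-side throttle would serialize requests with a mutex in a wrapping http.RoundTripper. The sketch below is self-contained and uses a local test server; serialTransport and peakConcurrency are hypothetical names, and nothing here is taken from the provider's code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// serialTransport is a hypothetical wrapper (not part of the provider) that
// allows only one request to be in flight at a time.
type serialTransport struct {
	mu   sync.Mutex
	next http.RoundTripper
}

// RoundTrip holds the lock until the response headers arrive, so the lock
// covers the server's handling of the request but not the body read.
func (s *serialTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.next.RoundTrip(req)
}

// peakConcurrency fires `workers` concurrent requests at a local test server
// through serialTransport and reports the highest number of requests the
// server saw being handled simultaneously.
func peakConcurrency(workers int) int {
	var mu sync.Mutex
	inFlight, peak := 0, 0

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		inFlight++
		if inFlight > peak {
			peak = inFlight
		}
		mu.Unlock()

		time.Sleep(5 * time.Millisecond) // simulate a slow config save

		mu.Lock()
		inFlight--
		mu.Unlock()
	}))
	defer srv.Close()

	client := &http.Client{Transport: &serialTransport{next: http.DefaultTransport}}

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get(srv.URL)
			if err == nil {
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()

	mu.Lock()
	defer mu.Unlock()
	return peak
}

func main() {
	fmt.Println("peak concurrent requests:", peakConcurrency(8))
}
```

In a provider, the wrapper would sit around whatever transport the auth transports already use. This only serializes requests from one provider process, and whether that alone avoids the "Could not merge and save new descriptor" errors is untested.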
Duplicate of #9. The suggested workaround is to set parallelism to 1.
JFrog also provided an alternative solution: they recommended setting artifactory.central.config.save.number.of.retries=20 in artifactory.system.properties. With this you can keep Terraform multithreaded; however, we have noticed that infrequent errors can occur with access-related resources (such as users, groups, permissions) when doing batch operations. The ticket you linked is to fix these errors.
I have looked at client-side throttling in the past, but I think the ideal solution would be retries with exponential backoff, which would have to be added to every resource. This is not a priority, however, since the issue is easily worked around.
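As a rough illustration of what retries with exponential backoff could look like, here is a minimal, self-contained sketch. It is not the provider's implementation: retryWithBackoff and errConflict are hypothetical names, and how a resource would recognize the 400 "Could not merge and save new descriptor" response as retryable is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConflict stands in for the 400 "Could not merge and save new descriptor"
// response. How the provider would detect that case is an assumption.
var errConflict = errors.New("could not merge and save new descriptor")

// retryWithBackoff calls op until it succeeds, returns a non-retryable error,
// or the attempts are exhausted. The delay doubles after each failed attempt.
func retryWithBackoff(attempts int, base time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		err = op()
		if err == nil || !errors.Is(err, errConflict) {
			return err // success, or an error that retrying will not fix
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retryWithBackoff(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errConflict // simulate two failed saves
		}
		return nil
	})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

Each resource's create, update, and delete call would be wrapped in op. Adding random jitter to the delay is a common refinement so that parallel operations do not retry in lockstep.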
See https://www.jfrog.com/jira/browse/RTFACT-16638. I still have an issue with that.