acme2certifier
acme.sh fails with "Sign error, wrong status" when a2c ca_server.get_cert() fails with error: The NETBIOS connection with the remote host timed out.
When using acme.sh with a2c and mswcce_ca_handler.py, I see some strange behavior.
All VMs (a2c, acme.sh, acme-dns, MS CA and domain controller) are in one network and can reach each other directly. This environment runs on Azure VMs, but I also reproduced the same issue with VMs running on VMware Workstation.
- a2c finishes the validation and tries to enroll for a certificate from MS CA using mswcce_ca_handler
- the certificate gets issued by the MS CA, but a2c shows a NETBIOS connection error when it tries to get the certificate
- acme.sh ends with the error "Sign error, wrong status"
- acme.sh tries again with a new request and all goes smoothly
Any idea what the main issue could be here? Is it possible to retry pulling the certificate from the MS CA after the NETBIOS failure?
Logs from both acme.sh and a2c are attached here. a2clog_amcesh.txt acmeshlog.txt
I see this error from time to time in my lab as well. In the past I thought it was related to my setup (I am accessing the MS CA via an ssh tunnel), but that does not seem to be the case.
I will look into it during the upcoming days. However, are you saying that the same setup works fine with other ACME clients? Is it a permanent issue when using acme.sh?
I normally test with acme.sh, so I've seen this error mostly when using acme.sh.
However, there was one instance where it also happened with cert-manager.
The error appears more and more often now, from both cert-manager and acme.sh.
Any idea yet?
It's unlikely that the choice of ACME client would affect a server-side NETBIOS connection. The error in question is being thrown by the impacket library: https://github.com/fortra/impacket/blob/master/impacket/nmb.py#L285. However, the wording is their default for that exception type, and the point where it occurs will vary.
Beware of cached name resolution if your target machine's IP address can change.
@webprofusion-chrisc I agree, I was just answering the previous question about whether the error occurs with different clients.
I'm not sure that cached name resolution is the issue here. The target machines (both DC and ADCS CA) have static IP addresses.
Also it's worth noting:
- on ADCS CA, the certificate gets issued (so the request does reach the CA)
- right after the error, when the client tries again (within seconds) it gets a cert successfully
I was thinking of a temporary workaround: read the error in the exception at https://github.com/grindsa/acme2certifier/blob/bc9deb61aa8d414b3e33298f08a0fc01555f0d4d/examples/ca_handler/mswcce_ca_handler.py#L241-L243 and, if the error is "The NETBIOS connection with the remote host timed out", simply try cert_raw = convert_byte_to_string(request.get_cert(convert_string_to_byte(csr))) again before failing.
The workaround didn't work after all.
I tried to catch the error, sleep for 2 seconds, then build the request again with request = self.request_create(), but the same error showed up on the retry.
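For reference, the retry described above has roughly the following shape. This is a minimal sketch and not the handler's actual code: call_with_retry and its parameters are hypothetical names, and operation stands in for "build the request and call get_cert". As noted above, the retry hit the same error in this setup, so this only illustrates the attempt.

```python
import time

# Message text taken from the error reported in this thread.
NETBIOS_TIMEOUT_MSG = "NETBIOS connection with the remote host timed out"


def call_with_retry(operation, retries=1, delay=2.0, match=NETBIOS_TIMEOUT_MSG):
    """Call operation(); retry only when the error message contains `match`.

    `operation` is a hypothetical zero-argument callable, for example a
    function that creates a fresh request and fetches the certificate.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception as err:
            # Re-raise unrelated errors, or give up once retries are used up.
            if match.lower() not in str(err).lower() or attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay)
```

Building a fresh request inside operation matters here, because a connection that has already timed out is unlikely to recover on its own.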
Hi,
Sorry for not commenting earlier, but I was quite busy the last few weeks.
I agree that the error is most likely not related to the ACME client. The reason for asking is that I am looking for a reliable way to replicate the issue.
Let me give it another try over the weekend.
/G.
Hi,
Sorry, I am still not able to replicate the issue. However, it's worth checking whether increasing the timeout of the DCE connection helps you overcome the issue. The default is 5 seconds; maybe a higher value works better in your environment.
I updated the handler and introduced a timeout option in acme_srv.cfg to make the timeout configurable.
[CAhandler]
...
timeout: 20
Please give it a try with the updated handler and check if things get better.
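To illustrate how such an option is typically read, here is a minimal sketch using Python's standard configparser. It is not the handler's actual implementation: load_timeout is a hypothetical helper, and the fallback of 5 seconds is taken from the default mentioned above.

```python
import configparser

DEFAULT_TIMEOUT = 5  # seconds; the default mentioned in this thread


def load_timeout(cfg_text):
    """Return the [CAhandler] timeout from config text, or the default.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    try:
        # fallback covers a missing section or a missing option
        return parser.getint("CAhandler", "timeout", fallback=DEFAULT_TIMEOUT)
    except ValueError:
        # a non-numeric value falls back to the default as well
        return DEFAULT_TIMEOUT


if __name__ == "__main__":
    print(load_timeout("[CAhandler]\ntimeout: 20\n"))
```

Falling back to the default on a malformed value keeps the handler usable even if the option is mistyped.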
Closed due to inactivity. In case you would like to follow up, please re-open.
I apologize for the late response.
Actually, the timeout solution seems to have fixed the problem. I applied it a few weeks ago and since then the problem hasn't occurred again.
Thank you so much for the fix :)