Error with Azure DNS + LetsEncrypt domain validation token that starts with a dash char

Open lukexcom opened this issue 4 months ago • 0 comments

Scenario:

Creating/renewing a certificate using LetsEncrypt as the CA and Azure DNS for domain validation fails when the LetsEncrypt token starts with a dash character.

When generating a new LetsEncrypt certificate, LetsEncrypt provides a domain validation token that must be added as a TXT record to the relevant DNS server that's authoritative for that domain. To do so for Azure, getssl uses the DNS scripts in ./dns_scripts/dns_add_azure to create the relevant DNS records with the LetsEncrypt-provided token.

The relevant command to add the TXT record in that script is: az network dns record-set txt add-record --record-set-name -g "$AZURE_RESOURCE_GROUP" -z "$zone_id" -n "$recordset" -v "$token"

However, that token, as provided by LetsEncrypt, sometimes starts with a dash character.

This means that the actual Azure CLI command to create the validation TXT record becomes (I'm only expanding the value for the -v parameter here): az network dns record-set txt add-record --record-set-name -g "$AZURE_RESOURCE_GROUP" -z "$zone_id" -n "$recordset" -v -qHFsHOWCADL7w_Hs3y...

The -v parameter requires an argument to follow immediately after it. Azure CLI cannot process the above command, however, because it interprets the dash-prefixed token after -v as another flag or parameter rather than as the value for -v.

As stated in Azure CLI issue 2588, this is due to an underlying CPython issue, 53580, with argparse. However, the Python developers have decided that the issue is effectively unfixable, so the Azure CLI is stuck with it for the long run. The ultimate cause of this issue therefore does not lie with getssl, LetsEncrypt, or Azure CLI, but with Python.
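The argparse behavior can be reproduced outside of Azure CLI. The following is a minimal sketch, not Azure CLI's actual parser: the `demo` parser, the `try_parse` helper, and the token `-exampleToken123` are hypothetical, and `nargs="+"` is an assumption chosen because it yields the same "expected at least one argument" message seen in the getssl output.

```python
import argparse
import contextlib
import io


def build_parser():
    # Hypothetical stand-in for the Azure CLI option. nargs="+" is an
    # assumption that matches the wording of the reported error.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--value", "-v", nargs="+")
    return parser


def try_parse(argv):
    """Return (values, error_text); exactly one of the two is None."""
    stderr = io.StringIO()
    try:
        # argparse prints usage and the error to stderr, then exits.
        with contextlib.redirect_stderr(stderr):
            namespace = build_parser().parse_args(argv)
    except SystemExit:
        return None, stderr.getvalue()
    return namespace.value, None


# A token without a leading dash is accepted as the value of -v.
print(try_parse(["-v", "exampleToken123"]))

# A token with a leading dash is classified as an option, so -v is
# left without any argument and parsing fails.
values, error = try_parse(["-v", "-exampleToken123"])
print(values)
print(error)
```

The second call fails because argparse decides whether each string is an option or an argument before it assigns arguments to -v, and a string with a leading dash that is not a negative number is treated as an option.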

Suggested Fix #837:

Although no resolution to the underlying issue is expected, one successful workaround is to replace the short form -v with the long form --value= and attach the token directly after the equals sign, which makes the resulting command look like this: az network dns record-set txt add-record --record-set-name -g "$AZURE_RESOURCE_GROUP" -z "$zone_id" -n "$recordset" --value="$token"

And in its partially-expanded form, it would look like this with the actual "offending" token value: az network dns record-set txt add-record --record-set-name -g "$AZURE_RESOURCE_GROUP" -z "$zone_id" -n "$recordset" --value=-qHFsHOWCADL7w_Hs3y...

This forces Python's argparse to treat the provided token argument to the value field the way it's supposed to be treated, i.e. a token starting with a dash is indeed the token value to be inserted as a DNS TXT record, and should not be treated as the start of the next parameter/flag.
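The effect of the `--value=` form can be checked with plain argparse. This is a minimal sketch under stated assumptions: the `demo` parser is a hypothetical stand-in for the Azure CLI option (with `nargs="+"` assumed), and `-exampleToken123` is a made-up token, not one issued by LetsEncrypt.

```python
import argparse

# Hypothetical stand-in for the Azure CLI option; nargs="+" is assumed.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--value", "-v", nargs="+")

# Made-up token with a leading dash.
token = "-exampleToken123"

# With the option and the token joined by "=", argparse splits the string
# at the first "=" and takes everything after it as the explicit argument,
# so the leading dash is never inspected as a possible flag.
namespace = parser.parse_args([f"--value={token}"])
print(namespace.value)  # ['-exampleToken123']
```

Because the option name and the token arrive as a single string, the token is bound to the option before any flag detection can apply to it.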

Having manually made the above modification to the ./dns_scripts/dns_add_azure command on my systems, I no longer encounter this issue. Can the same change be made in the getssl dns scripts?

Steps to reproduce the behavior:

Perform successive certificate generations using LetsEncrypt's Staging API and Azure DNS until LetsEncrypt issues a token that starts with a dash character.

Or, alternately, manually run the az network command from Bash with the appropriate parameter values and with a value token that starts with a dash.

The resulting error appears as follows in getssl's stdout. This (sanitized) output is from a second attempt to generate a certificate with three SANs, with the "dashed" token occurring on the first SAN. The seven lines beginning with the "argument" line are all output from the Azure CLI that becomes embedded in the getssl output:

Registering account
Verify each domain
xxx.xxx.xxx is already validated
Verifying xxy.xxx.xxx
argument --value/-v: expected at least one argument

Examples from the AI knowledge base:
az network dns record-set txt add-record --resource-group MyResourceGroup --zone-name www.mysite.com --record-set-name MyRecordSet --value Owner=WebTeam
Add a TXT Record.
https://docs.microsoft.com/en-US/cli/azure/network/dns/record-set/txt#az_network_dns_record_set_txt_add_record
Read more about the command in reference docs
getssl: DNS_ADD_COMMAND failed for domain xxy.xxx.xxx

Affected System: getssl 2.48, Azure CLI 2.56.0, Python 3.11.5 (embedded in Azure CLI), RHEL 8.8 x86-64, Bash 4.4.20(1)-release

lukexcom, Feb 15 '24 15:02