key retrieval failed ... reply truncated

Open porjo opened this issue 6 years ago • 5 comments

I'm using opendkim 2.10.3, seeing lots of these kinds of messages in the log:

opendkim[107]: 69C74D511E4: key retrieval failed (s=aweber_key_a, d=aweber.com): \
'aweber_key_a._domainkey.aweber.com' reply truncated
opendkim[107]: BBFC7D4F68B: key retrieval failed (s=20161025, d=gmail.com): \
'20161025._domainkey.gmail.com' reply truncated

According to https://stackoverflow.com/a/54893405/202311 it may be due to missing UDPSize option in DNS lookup query.

porjo avatar Jun 06 '19 04:06 porjo

This message is logged when the function libopendkim/util.c:dkim_check_dns_reply() returns 1, which means “1 -- reply truncated but usable”. The function returns 1 in three places; insert a distinguishable syslog() statement at each of them. That shows which “return 1” is triggered and helps to find out why 1 is returned.

2.10.3 is known to have errors; use opendkim from the develop git branch here.

dilyanpalauzov avatar Jul 05 '19 13:07 dilyanpalauzov

For the record, I use opendkim with libunbound for handling the DNS and do not have this problem.

dilyanpalauzov avatar Jul 18 '19 13:07 dilyanpalauzov

The error means the reply we got from the resolver had the "truncation" bit set, meaning it may be the case that the reply was incomplete and unusable because it was too big for UDP. In addition, we got a resource record type we didn't expect (i.e., we wanted a TXT record, but something other than that came back).
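To make the "truncation" bit concrete, here is a minimal sketch of how it can be read from a raw DNS reply. It assumes the standard header layout from RFC 1035, where TC is bit 0x02 of the third header byte. The helper name dns_reply_truncated() is hypothetical and is not part of libopendkim.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the fixed DNS message header (RFC 1035, section 4.1.1). */
#define DNS_HEADER_LEN 12

/* The TC (truncation) flag is bit 0x02 of the third header byte. */
#define DNS_FLAG_TC 0x02

/*
 * Returns 1 if the reply has the TC bit set, 0 if it does not,
 * and -1 if the buffer is too short to hold a DNS header.
 * Hypothetical helper for illustration; not part of libopendkim.
 */
int dns_reply_truncated(const uint8_t *reply, size_t len)
{
    assert(reply != NULL);
    if (len < DNS_HEADER_LEN)
        return -1;
    return (reply[2] & DNS_FLAG_TC) ? 1 : 0;
}
```

A caller would pass the answer buffer filled in by the resolver together with the returned length. A result of 1 suggests the answer did not fit in the UDP reply and the query should be retried over TCP.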

Unless you configure opendkim to use unbound, we use your libc-provided resolver which might not automatically try again with TCP to get rid of the truncation. (We just call res_query() and expect it to do that for us.)

I'll test both of the examples you gave and see if I can reproduce the problem.

mskucherawy avatar Jul 24 '19 18:07 mskucherawy

I tried this on my dev machine and it works fine; I don't get the "tc" (truncation) bit set on my replies. I suspect the resolver is re-sending the request using TCP for me.

The default resolver interface in libc, res_query(), doesn't appear to switch to TCP by default for you. You might be able to enable this though with options in resolv.conf.
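As a sketch of what such resolv.conf options could look like: the fragment below assumes the glibc resolver, which documents `edns0` (advertise a larger UDP payload size) and `use-vc` (force TCP for every query). Other libcs, such as musl, may ignore these options. The nameserver address is a placeholder.

```
# /etc/resolv.conf
# Placeholder address; substitute your own resolver.
nameserver 192.0.2.53
# edns0: enable EDNS0 so larger replies fit in UDP.
# use-vc: send all queries over TCP instead of UDP.
options edns0 use-vc
```

Either option alone may be enough. `use-vc` affects every lookup made through the system resolver on that host, not just those from opendkim.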

You could also do as suggested and try linking against libunbound.

mskucherawy avatar Jul 24 '19 19:07 mskucherawy

Sorry for commenting a year later; I ran into something similar. Running the test command gave me:

opendkim-testkey: using default configfile /etc/opendkim/opendkim.conf
opendkim-testkey: checking key '202006....'
opendkim-testkey: '202006._domainkey....' reply truncated
(removed domain name)
I am not getting an error, but no success either. You mentioned libc: I am using Alpine, which uses musl. Could musl be causing the problem?

ghost avatar Jun 22 '20 04:06 ghost

Hey there,

Apologies for commenting after an even longer time. musl is not something we test heavily, but I think the answer is the same: when you send a query whose answer is a large DNS packet (long TXT records), the built-in system resolver doesn't set a proper EDNS maximum packet size, and it also doesn't know how to retry the query over TCP.

If someone can show me this problem when opendkim is built against libunbound, I'd like to investigate further, but reimplementing resolver logic that already exists is not where our efforts should be spent.

thegushi avatar Dec 29 '22 21:12 thegushi