IDNA proposal

Open tlimoncelli opened this issue 1 year ago • 19 comments

Mar 18 '24 00:03 tlimoncelli

Mentioning a number of people that I suspect have IDNA domains or knowledge about how IDNA works (based on skimming bugs and PRs).

@adamus1red @dkim1970 @flz @juliusrickert @killerbees19 @Kusado @louis-lau @masterzen @mderriey @pmoroney @tresni @Yannik

I'd love to receive feedback about this proposal!

Mar 21 '24 13:03 tlimoncelli

@tlimoncelli

I think your example for the ascii==unicode case might be wrong:

#7: + CREATE foo.example.com MX 10 xn--p1ai.com (рф.com) (ttl=14400)

Was this what you intended?

Mar 21 '24 14:03 Yannik

I noticed you created both CREATE and MODIFY examples for ascii (unicode) and unicode (ascii), but only MODIFY examples for ascii and unicode. How is that to be understood?

Mar 21 '24 14:03 Yannik

@Yannik wrote:

@tlimoncelli

I think your example for the ascii==unicode case might be wrong:

#7: + CREATE foo.example.com MX 10 xn--p1ai.com (рф.com) (ttl=14400)

Was this what you intended?

Ah, good point!

The first one is where the label is ascii==unicode but the target is ascii!=unicode. I'll update the comment.

Thanks for finding that!

Tom

Mar 21 '24 14:03 tlimoncelli

@Yannik wrote:

I noticed you created both CREATE and MODIFY examples for ascii (unicode) and unicode (ascii), but only MODIFY examples for ascii and unicode. How is that to be understood?

I've added more examples. I don't think I've covered every combination, but my goal is to show typical examples not every possible example.

I've also added examples where we use {} and ⟬⟭ and ❮❯. I think using unicode chars to highlight unicode domains would be cool (maybe too clever?).
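For readers following along, the two bracketed styles can be sketched roughly as below. This is only an illustration in Python, not dnscontrol code (dnscontrol itself is written in Go). The helper names `to_ascii`, `to_unicode`, and `annotate` are made up for this sketch, and Python's built-in `idna` codec follows the older IDNA2003 rules. A name is printed once when its ASCII and Unicode forms are identical, and with the other form in parentheses when they differ.

```python
def to_ascii(name: str) -> str:
    """Punycode (ACE) form, via Python's built-in IDNA2003 codec."""
    return name.encode("idna").decode("ascii").lower()


def to_unicode(name: str) -> str:
    """Unicode form of a name given in either form."""
    return to_ascii(name).encode("ascii").decode("idna")


def annotate(name: str, ascii_first: bool = True) -> str:
    """Hypothetical display helper: add the other form only if it differs."""
    a, u = to_ascii(name), to_unicode(name)
    if a == u:
        return a
    return f"{a} ({u})" if ascii_first else f"{u} ({a})"


label = annotate("foo.example.com")   # ascii == unicode, shown once
target = annotate("рф.com")           # ascii != unicode, both shown
print(f"+ CREATE {label} MX 10 {target} (ttl=14400)")

# The same target in the "unicode (ascii)" style:
print(annotate("xn--p1ai.com", ascii_first=False))
```

With these helpers, the label prints as plain foo.example.com while the target carries both forms, which is the mixed case the corrected example describes.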

Mar 21 '24 15:03 tlimoncelli

While I'm not against having the ASCII and UTF on the output lines, I do worry it might make the output too busy. Wouldn't simply using the .Name value be better, since it should then pretty much match what is in the dnscontrol configuration?

Mar 21 '24 19:03 adamus1red

LGTM

Mar 21 '24 20:03 pmoroney

@adamus1red wrote:

While I'm not against having the ASCII and UTF on the output lines, I do worry it might make the output too busy. Wouldn't simply using the .Name value be better, since it should then match what is in the dnscontrol configuration?

That's an interesting point! I guess my thought is that showing both versions helps with debugging.

Mar 21 '24 21:03 tlimoncelli

@tlimoncelli wrote:

That's an interesting point! I guess my thought is that showing both versions helps with debugging.

@tlimoncelli maybe a compromise would be if the output was the same as what the DNS provider or Registrar used.

I know I've had issues where the DNS provider uses UTF but the registrar uses ASCII. For example, Namecheap uses ASCII, so registrar output for Namecheap would use ASCII punycode, while the DNS provider is Cloudflare, which uses UTF, so its output would use UTF.

Mar 21 '24 21:03 adamus1red

The only IDNA domain I have is for fun, so I don't have a strong preference. I'll give my input nonetheless :). If you want to show both, I think I like B better, as it feels more consistent to me. Anything not in brackets will always be ASCII that way.

I'd probably go with showing what the original user input was, with a flag to only show ASCII if needed. It's less information to parse, and the user should be familiar with it as that's the way it's listed in their config. I could see points being made for showing both, but I've always liked things more distraction free and less dense.

I think the Unicode brackets are a little too clever, perhaps even a little confusing ;)

Mar 21 '24 21:03 louis-lau

First of all, improving IDNA handling would be a great improvement to dnscontrol.

Regarding output, the one thing I definitely do not like is having the ascii output come first, because it is the one least likely to be understood/mentally associated with the relevant domain.

I think simply using the original user input has merit, pairing that with a toggle to additionally show ascii seems fine to me.
However, I also wouldn't mind the unicode (ascii) output.

Mar 23 '24 15:03 Yannik

@Yannik wrote:

First of all, improving IDNA handling would be a great improvement to dnscontrol.

Regarding output, the one thing I definitely do not like is having the ascii output come first, because it is the one least likely to be understood/mentally associated with the relevant domain.

I think simply using the original user input has merit, pairing that with a toggle to additionally show ascii seems fine to me. However, I also wouldn't mind the unicode (ascii) output.

I second this suggestion. Displaying the "human readable" format would lower the barrier to using IDNs with dnscontrol, because the IDNA format is not human readable, especially for languages that do not use the Latin script.

Mar 23 '24 15:03 dkim1970

This is excellent feedback! It's getting me excited!

Question: In what situations would people want to see something besides the .Name (the user input) version?

Mar 23 '24 16:03 tlimoncelli

@tlimoncelli wrote:

Question: In what situations would people want to see something besides the .Name (the user input) version?

What if the registrar or DNS provider uses something different from the .Name value? Then include the version they are using in brackets.

Mar 23 '24 18:03 adamus1red

Personally, I think whatever the dns provider does isn't relevant to the cli output. Behind the scenes at every provider, it's all punycode anyway.

Mar 23 '24 20:03 louis-lau

@louis-lau wrote:

Personally, I think whatever the dns provider does isn't relevant to the cli output. Behind the scenes at every provider, it's all punycode anyway.

Agreed, having the display format handled outside of the provider is to be preferred IMO.

Mar 23 '24 20:03 Yannik

I don't have experience with IDNA at all. My $0.02: I do agree that showing the ascii version is useful only for debugging. What users want to see is whether their unicode domain (or however it was entered in dnsconfig.js) is being processed, and how.

Mar 24 '24 12:03 masterzen

Hi folks!

2 ideas:

Support multiple formats?

There's been a lot of discussion about ascii (unicode) vs unicode (ascii). It might be possible to add a command line flag that selects the format. No promises, but it might be possible. In that case, I'd recommend the default be userinput and add a flag for debugging that shows unicode (ascii) or userinput (ascii).

I'll know more if this is possible when I start coding.
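A rough sketch of what such a format switch might look like, written in Python purely for illustration (dnscontrol is a Go program and has no such flag as far as this thread shows). The flag name `--name-format`, the format values, and every helper name here are hypothetical, and the conversion uses Python's built-in IDNA2003 `idna` codec.

```python
import argparse

# Hypothetical format values; the thread does not settle on names.
FORMATS = ("userinput", "unicode-ascii", "userinput-ascii")


def to_ascii(name: str) -> str:
    return name.encode("idna").decode("ascii").lower()


def to_unicode(name: str) -> str:
    return to_ascii(name).encode("ascii").decode("idna")


def display_name(user_input: str, fmt: str = "userinput") -> str:
    """Render a name in one of the formats floated in the discussion."""
    if fmt == "userinput":
        return user_input                      # default: echo the config
    if fmt == "unicode-ascii":
        primary = to_unicode(user_input)
    elif fmt == "userinput-ascii":
        primary = user_input
    else:
        raise ValueError(f"unknown format: {fmt}")
    ascii_form = to_ascii(user_input)
    # Skip the parenthetical when it would just repeat the primary form.
    return primary if primary == ascii_form else f"{primary} ({ascii_form})"


parser = argparse.ArgumentParser()
parser.add_argument("--name-format", choices=FORMATS, default="userinput")
args = parser.parse_args(["--name-format", "unicode-ascii"])
print(display_name("xn--p1ai.com", args.name_format))
```

Keeping userinput as the default means ordinary runs match dnsconfig.js exactly, and the two debugging formats only add the ASCII form when it actually differs.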

An idea that would break less existing code

Existing code expects .Name to be ASCII (the current code runs dc.Punycode() for all providers, which rewrites .Name to be ASCII). Rather than require every use of .Name to change to .NameASCII, maybe the names should be: .Name (ASCII, to be compatible with old code), .NameORIG (how the user input the string), .NameUNICODE, and .NameDisplay.

Mar 25 '24 13:03 tlimoncelli

Also, IDN handling is not uniform if we compare .de and .com. TLD providers support different IDNA standards (IDNA2003 vs. IDNA2008, UTS46), so translating the same IDN can end in a different punycode variant depending on the TLD.

Here is an example from the HEXONET provider's ConvertIDN API command:

[COMMAND]
COMMAND = ConvertIDN
DOMAIN0 = ärzte.com
DOMAIN1 = ärzte.de
EOF

[RESPONSE]
CODE = 200
DESCRIPTION = Command completed successfully
PROPERTY[ACE][0] = xn--rzte-koa.com
PROPERTY[ACE][1] = xn--rzte-koa.de
PROPERTY[IDN][0] = ärzte.com
PROPERTY[IDN][1] = ärzte.de
EOF

No difference here. But let us pick a name with a German special character:

[COMMAND]
COMMAND = ConvertIDN
DOMAIN0 = fußball.com
DOMAIN1 = fußball.de
EOF

[RESPONSE]
CODE = 200
DESCRIPTION = Command completed successfully
PROPERTY[ACE][0] = fussball.com
PROPERTY[ACE][1] = xn--fuball-cta.de
PROPERTY[IDN][0] = fussball.com
PROPERTY[IDN][1] = fußball.de
EOF

Suppose we ignore that .com handles this differently and use the punycode variant returned for .de as a .com domain name: xn--fuball-cta.com. While that IDN translation is technically correct, it will not work with the TLD provider, because the provider supports a different IDNA standard. I therefore think this needs to be considered as well in an IDNA proposal. The API command above maps each name to a variant that works for its TLD.
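The ß difference can be reproduced with Python's standard library. Treat this as a rough illustration, not a statement about any registry's actual rules. The built-in `idna` codec implements IDNA2003, which maps ß to ss before encoding. Encoding the label directly with the `punycode` codec keeps the ß, which gives the same xn--fuball-cta form the response above shows for .de. Both helper names are made up for this sketch.

```python
def ace_idna2003(label: str) -> str:
    """ToASCII with IDNA2003 mapping (Python's built-in "idna" codec)."""
    return label.encode("idna").decode("ascii")


def ace_no_mapping(label: str) -> str:
    """Punycode the label as-is, with no IDNA2003 mapping step.

    Assumption: this only approximates a standard that preserves ß; it
    skips all the validation a real IDNA2008 library would perform.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


for label in ("ärzte", "fußball"):
    print(label, ace_idna2003(label), ace_no_mapping(label))
```

For ärzte both functions agree (xn--rzte-koa), while for fußball they diverge into fussball and xn--fuball-cta, which is the same split the two TLDs show in the ConvertIDN response.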

Mhmm... Then again, DNSControl runs on "existing" data configured by the user. So the input should be considered "correct" (it would be very stupid otherwise), and we can probably consider this particular discussion superfluous... Or should DNSControl exit with an error if a potential IDN precheck fails?

Mhmm 2 ... The DNS/domain provider should ultimately be capable of handling that on their own (returning an error message stating that the provided domain/DNS zone name is invalid), so you guys do not have to worry about all that. Sorry that I bumped this up :-)

May 12 '25 08:05 KaiSchwarz-cnic

Thanks for all the feedback!

After writing a bunch of code to implement this for zone names (but not yet the labels of individual records), it's quite clear that NameDisplay is not needed. There's only one place that uses it, and we can just generate the right format at that time.

DomainConfig now stores:

  • .Name: I would call this .NameASCII as recommended, but that would break a lot of code. Maybe some day we'll have "a great renaming"?
  • .NameRaw: The domain name as the user input it in dnsconfig.js (passed through ToLower)
  • .NameUnicode: The domain passed through strings.ToLower then idna.ToUnicode().

I don't know if the way I call ToLower will break things. Actually, now that I write this I think it will. Maybe I should do this instead:

  • .Name: call idna.ToASCII() then strings.ToLower()
  • .NameRaw: The domain name as the user input it in dnsconfig.js, with no changes (so far no code uses this. Maybe that's a sign?)
  • .NameUnicode: call idna.ToASCII() then strings.ToLower() then idna.ToUnicode()
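The second ordering (ToASCII, then ToLower, then ToUnicode) can be sketched as follows. This is a Python stand-in, not the code from the PR: the built-in IDNA2003 `idna` codec plays the role of the `idna.ToASCII()` and `idna.ToUnicode()` calls named above, and `DomainNames` and `make_names` are hypothetical names that only mirror the three fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainNames:
    name: str          # ASCII (punycode) form, lowercased
    name_raw: str      # exactly as the user typed it
    name_unicode: str  # Unicode form derived from the ASCII form


def make_names(raw: str) -> DomainNames:
    # ToASCII first, then lowercase the ASCII result.
    ascii_name = raw.encode("idna").decode("ascii").lower()
    # ToUnicode on the already-lowercased ASCII form.
    unicode_name = ascii_name.encode("ascii").decode("idna")
    return DomainNames(ascii_name, raw, unicode_name)


for raw in ("Example.COM", "ÄRZTE.de", "XN--RZTE-KOA.de"):
    print(make_names(raw))
```

The last input hints at why lowercasing before the Unicode conversion matters in this sketch: an upper-case XN-- prefix is only recognized as punycode once it has been lowercased.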

Dec 02 '25 15:12 tlimoncelli

https://github.com/StackExchange/dnscontrol/pull/3879 fixes the problems and, thanks to all of your suggestions, just works a lot better.

Dec 02 '25 22:12 tlimoncelli