Add test cases for IDNA2003 vs IDNA2008

Open sporkmonger opened this issue 9 years ago • 5 comments

Per http://www.unicode.org/reports/tr46/, make sure Addressable supports IDNA2008, and add tests for the major edge cases where IDNA2003 and IDNA2008 differ.
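For reference, UTS #46 names four "deviation" characters that are valid in both standards but are handled differently: ß (U+00DF), ς (U+03C2), ZWNJ (U+200C) and ZWJ (U+200D). The sketch below is only an illustration of those edge cases in plain Ruby, not Addressable's code; the names `IDNA2003_DEVIATIONS`, `transitional_map` and `deviates?` are made up for this example.

```ruby
# Hypothetical helper, not part of Addressable.
# The four UTS #46 deviation characters and what IDNA2003-style
# (transitional) processing turns them into.
IDNA2003_DEVIATIONS = {
  "\u00DF" => "ss",      # ß (sharp s) becomes "ss"
  "\u03C2" => "\u03C3",  # ς (final sigma) becomes σ
  "\u200C" => "",        # ZERO WIDTH NON-JOINER is removed
  "\u200D" => ""         # ZERO WIDTH JOINER is removed
}.freeze

DEVIATION_PATTERN = Regexp.union(IDNA2003_DEVIATIONS.keys)

# Apply the IDNA2003-style mapping to a single label.
def transitional_map(label)
  label.gsub(DEVIATION_PATTERN, IDNA2003_DEVIATIONS)
end

# True when a label contains a character that the two standards treat differently.
def deviates?(label)
  label.match?(DEVIATION_PATTERN)
end

puts transitional_map("faß")  # prints "fass"
puts deviates?("example")     # prints "false"
```

Any label for which `deviates?` returns true is a candidate test case, since an IDNA2003 backend and an IDNA2008 backend would be expected to produce different ASCII forms for it.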

sporkmonger avatar Nov 03 '16 20:11 sporkmonger

It looks like Addressable follows IDNA 2003 when it is used with libidn (via the idn-ruby gem), as stated on the libidn website. When libidn (the idn-ruby gem) isn't available and we fall back to the "pure" version, IDNA 2008 appears to be used instead.

# native (libidn)
irb(main):001:0> Addressable::URI.parse("http://faß.de").normalize
=> #<Addressable::URI:0xf78 URI:http://fass.de/>

# pure
irb(main):001:0> Addressable::URI.parse("http://faß.de").normalize
=> #<Addressable::URI:0x13d8 URI:http://xn--fa-hia.de/>
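
To make the two results easier to compare: `fass` is already ASCII, so it passes through unchanged, while keeping ß means the label has to be Punycode-encoded. Below is a minimal, self-contained sketch of the RFC 3492 encoding step only (no mapping, normalization or validation). It is not Addressable's implementation, and the method names are hypothetical.

```ruby
# Punycode parameters from RFC 3492, section 5.
BASE = 36
TMIN = 1
TMAX = 26
SKEW = 38
DAMP = 700
INITIAL_BIAS = 72
INITIAL_N = 128

# Digit values 0..25 map to 'a'..'z', 26..35 map to '0'..'9'.
def punycode_digit(d)
  (d < 26 ? 97 + d : 22 + d).chr
end

# Bias adaptation from RFC 3492, section 6.1.
def punycode_adapt(delta, num_points, first_time)
  delta = first_time ? delta / DAMP : delta / 2
  delta += delta / num_points
  k = 0
  while delta > ((BASE - TMIN) * TMAX) / 2
    delta /= BASE - TMIN
    k += BASE
  end
  k + (BASE - TMIN + 1) * delta / (delta + SKEW)
end

# Encoding procedure from RFC 3492, section 6.3, for a single label.
def punycode_encode(label)
  code_points = label.unpack("U*")
  basic = code_points.select { |cp| cp < 128 }
  output = basic.pack("U*")
  handled = basic.length
  b = handled
  output << "-" if b > 0
  n = INITIAL_N
  delta = 0
  bias = INITIAL_BIAS
  while handled < code_points.length
    m = code_points.select { |cp| cp >= n }.min
    delta += (m - n) * (handled + 1)
    n = m
    code_points.each do |cp|
      delta += 1 if cp < n
      next unless cp == n
      q = delta
      k = BASE
      loop do
        t = if k <= bias then TMIN
            elsif k >= bias + TMAX then TMAX
            else k - bias
            end
        break if q < t
        output << punycode_digit(t + (q - t) % (BASE - t))
        q = (q - t) / (BASE - t)
        k += BASE
      end
      output << punycode_digit(q)
      bias = punycode_adapt(delta, handled + 1, handled == b)
      delta = 0
      handled += 1
    end
    delta += 1
    n += 1
  end
  output
end

# Hypothetical wrapper: ASCII labels are kept, others get the ACE prefix.
def to_ascii_label(label)
  label.ascii_only? ? label : "xn--" + punycode_encode(label)
end

puts to_ascii_label("fass")  # prints "fass"
puts to_ascii_label("faß")   # prints "xn--fa-hia"
```

So the difference between the two backends comes down to whether ß is mapped to "ss" before this encoding step runs.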

Full repro:

arm64 $ docker run -it --rm ruby:3.1.3 bash
root@2a71c4abec4a:/# gem install addressable
Fetching public_suffix-5.0.1.gem
Fetching addressable-2.8.1.gem
Successfully installed public_suffix-5.0.1
Successfully installed addressable-2.8.1
2 gems installed
root@2a71c4abec4a:/# irb -raddressable/uri
irb(main):001:0> Addressable::URI.parse("http://faß.de").normalize
=> #<Addressable::URI:0x13d8 URI:http://xn--fa-hia.de/>
irb(main):002:0>
root@2a71c4abec4a:/# apt-get update && apt-get install -y libidn11-dev
Get:1 http://deb.debian.org/debian bullseye InRelease [116 kB]
Get:2 http://deb.debian.org/debian-security bullseye-security InRelease [48.4 kB]
Get:3 http://deb.debian.org/debian bullseye-updates InRelease [44.1 kB]
Get:4 http://deb.debian.org/debian bullseye/main arm64 Packages [8072 kB]
Get:5 http://deb.debian.org/debian-security bullseye-security/main arm64 Packages [218 kB]
Get:6 http://deb.debian.org/debian bullseye-updates/main arm64 Packages [12.0 kB]
Fetched 8510 kB in 1s (7654 kB/s)
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libidn11
The following NEW packages will be installed:
  libidn11 libidn11-dev
0 upgraded, 2 newly installed, 0 to remove and 10 not upgraded.
Need to get 708 kB of archives.
After this operation, 1212 kB of additional disk space will be used.
Get:1 http://deb.debian.org/debian bullseye/main arm64 libidn11 arm64 1.33-3 [115 kB]
Get:2 http://deb.debian.org/debian bullseye/main arm64 libidn11-dev arm64 1.33-3 [593 kB]
Fetched 708 kB in 0s (4362 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package libidn11:arm64.
(Reading database ... 22781 files and directories currently installed.)
Preparing to unpack .../libidn11_1.33-3_arm64.deb ...
Unpacking libidn11:arm64 (1.33-3) ...
Selecting previously unselected package libidn11-dev:arm64.
Preparing to unpack .../libidn11-dev_1.33-3_arm64.deb ...
Unpacking libidn11-dev:arm64 (1.33-3) ...
Setting up libidn11:arm64 (1.33-3) ...
Setting up libidn11-dev:arm64 (1.33-3) ...
Processing triggers for libc-bin (2.31-13+deb11u5) ...
root@2a71c4abec4a:/# gem install idn-ruby
Fetching idn-ruby-0.1.5.gem
Building native extensions. This could take a while...
Successfully installed idn-ruby-0.1.5
1 gem installed
root@2a71c4abec4a:/# irb -raddressable/uri
irb(main):001:0> Addressable::URI.parse("http://faß.de").normalize
=> #<Addressable::URI:0xf78 URI:http://fass.de/>

So for Addressable to fully support IDNA 2008 it would have to use Libidn2 somehow? 🤔

dentarg avatar Feb 01 '23 21:02 dentarg

As discussed in https://github.com/sporkmonger/addressable/issues/408#issuecomment-1421066788, I believe that yes, we should add support for libidn2. I looked for options and found:

  1. https://github.com/hfm/idn2-ruby: simply going from idn to idn2, but it's actually an abandoned empty shell.
  2. https://github.com/ogom/ruby-idna: a maintained wrapper dynamically linked using ffi (no static C extension to compile), looks like a good option to me
  3. https://github.com/HoneyryderChuck/idnx: provides an ffi wrapper to libidn2, a native Windows API version, and a pure Ruby version (IDNA2003). Might be interesting if we want to offload this part entirely, but in that case I'd have to get the pure Ruby implementation up to standard (#491). It's Apache licensed, not sure if that could be an issue.
  4. Alternatively as the FFI wrappers are not too complex, we could probably ditch the dependency, inline the wrapper code and just depend on ffi.

Even though it's not very DRY, I would lean toward option 4 in order to reduce exposure to dependency issues (project abandoned, hacked, gem version incompatibilities, etc.).

@sporkmonger @dentarg what do you think?

jarthod avatar Feb 07 '23 21:02 jarthod

I agree with you about option 4 and about reducing exposure to dependency issues.

At a glance https://github.com/ogom/ruby-idna looked well-maintained (it was recently updated) but then I noticed it was almost 6 years between versions.

dentarg avatar Feb 07 '23 23:02 dentarg

And thank you for doing this research. ❤️

dentarg avatar Feb 07 '23 23:02 dentarg

PR is here: #496

jarthod avatar Mar 22 '23 22:03 jarthod