build.sh: transliterate categories

Open backlin opened this issue 1 year ago • 1 comments

_templates/technical/faux_urlencode.awk can't handle multibyte characters, such as {å, ä, ö}:

echo 'abzåäöxy' | awk -f "_templates/technical/faux_urlencode.awk"
awk: towc: multibyte conversion failure on: '�

 input record number 1, file
 source line number 18

I suggest avoiding this problem by converting the input to ASCII with transliteration, so that each non-ASCII character is replaced by its closest ASCII representation before it reaches the awk script.

Example:

echo 'abzåäöxy' | iconv -f UTF-8 -t ascii//TRANSLIT | awk -f "_templates/technical/faux_urlencode.awk"
abza22a22oxy

Converting ä to 22a is not ideal, since the URLs look a little awkward, but it fixes the build and the HTML looks fine.
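If the awkward digits are a concern, one possible alternative is an explicit mapping for the Nordic letters instead of relying on iconv, whose transliteration output can vary between platforms. The following is only a sketch: `translit_nordic` is a hypothetical helper that does not exist in the repository, and the chosen replacements (å and ä to a, ö and ø to o, æ to ae) are my own assumption about what reads best in a URL.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository: replace common Nordic
# letters with plain ASCII so that later steps only ever see single-byte input.
# The mapping below is an assumption; adjust it to taste.
translit_nordic() {
    sed -e 's/å/a/g' -e 's/ä/a/g' -e 's/ö/o/g' \
        -e 's/æ/ae/g' -e 's/ø/o/g' \
        -e 's/Å/A/g' -e 's/Ä/A/g' -e 's/Ö/O/g' \
        -e 's/Æ/AE/g' -e 's/Ø/O/g'
}

# Same sample input as above; prints: abzaaoxy
printf '%s\n' 'abzåäöxy' | translit_nordic
```

In build.sh the output of the helper would then be piped into `awk -f "_templates/technical/faux_urlencode.awk"` in place of the iconv step. Characters outside this small mapping would still reach awk unchanged, so the iconv approach remains the more general fix.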

backlin, Aug 27 '23 07:08

Opening this because {å, ä, æ, ö, ø} are common characters in the Nordic languages, so the category system must support them to work in those languages.

backlin, Aug 27 '23 07:08