
Ability to insert invalid Unicode character

Open maxrys opened this issue 1 year ago • 0 comments

https://github.com/gumob/PunycodeSwift/blob/30a462bdb4398ea835a3585472229e0d74b36ba5/Sources/Punycode.swift#L148

At this point in the code, a guard is needed, for example:

                guard Character(scalar).isLetter ||
                      Character(scalar).isNumber else {
                    return nil
                }
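
To illustrate what this guard lets through, here is a minimal sketch. It assumes that `scalar` at that line is the `Unicode.Scalar` about to be inserted into the decoded output; the helper name `isAllowedScalar` is hypothetical and not part of PunycodeSwift.

```swift
// Hypothetical helper mirroring the guard above: accept a decoded scalar
// only if it is a letter or a number.
func isAllowedScalar(_ scalar: Unicode.Scalar) -> Bool {
    let character = Character(scalar)
    return character.isLetter || character.isNumber
}

// "\u{FC}" is "ü"; "\u{80}" and "\u{98}" are C1 control code points.
let samples: [Unicode.Scalar] = ["б", "\u{FC}", "5", "\u{80}", "\u{98}"]
for scalar in samples {
    // Print the code point in hex together with the guard's verdict.
    print("U+" + String(scalar.value, radix: 16, uppercase: true), isAllowedScalar(scalar))
}
```

The letters and the digit pass, while the control code points U+0080 and U+0098 are rejected.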

code for testing:

/* Note: every key is assumed to start with the "xn--" prefix, which is omitted here. */

let domains = [
    "90a"              : "б"            ,
    "--9sb"            : "б-"           ,
    "--btb"            : "-б"           ,
    "80acde"           : "абвг"         ,
    "abcd"             : "abcd"         ,
    "abcd-u8d"         : "ёabcd"        ,
    ""                 : ""             , // wrong by syntax (ascii part = n/a, codes part = n/a)
    "-"                : ""             , // wrong by syntax (ascii part = n/a, codes part = n/a)
    "y"                : ""             , // wrong by value
    "--"               : ""             , // wrong by syntax (ascii part = "-", codes part = n/a)
    "y-"               : ""             , // wrong by syntax (ascii part = "y", codes part = n/a)
    "-y"               : ""             , // wrong by syntax (ascii part = n/a, codes part = "y")
    "y-z"              : ""             , // wrong by value
    "abcd-v8d"         : "aёbcd"        ,
    "abcd-w8d"         : "abёcd"        ,
    "abcd-x8d"         : "abcёd"        ,
    "abcd-y8d"         : "abcdё"        ,
    "a-b-c-d"          : ""             , // wrong by value
    "a-b-c-d-lng"      : "ёa-b-c-d"     ,
    "a-b-c-d-mng"      : "aё-b-c-d"     ,
    "a-b-c-d-nng"      : "a-ёb-c-d"     ,
    "a-b-c-d-ong"      : "a-bё-c-d"     ,
    "a-b-c-d-png"      : "a-b-ёc-d"     ,
    "a-b-c-d-qng"      : "a-b-cё-d"     ,
    "a-b-c-d-rng"      : "a-b-c-ёd"     ,
    "a-b-c-d-sng"      : "a-b-c-dё"     ,
    "-a-b-c-d-vbhklm5q": "ёпрст-a-b-c-d",
    "a-b-c-d--3bhklm5q": "a-b-c-d-ёпрст",
    "r1aadaaghijkl"    : "ттуууфхцчшщ"  ,
    "bcher-kva"        : "bücher"       ,
    "80abnmycp7evc"    : "обращения"    ,
]

for (encoded, expected) in domains {
    // A failed decode (nil) is reported as an empty string.
    let decoded = encoded.punycodeDecoded ?? ""
    let status = decoded == expected ? "OK" : "ERROR"
    print("'\(encoded)' → '\(decoded)' | \(status)")
}

// generate a large number of random (mostly invalid) inputs to try to trigger a crash

let symbols = [
    "a", "k", "u", "0",
    "b", "l", "v", "1",
    "c", "m", "w", "2",
    "d", "n", "x", "3",
    "e", "o", "y", "4",
    "f", "p", "z", "5",
    "g", "q", "-", "6",
    "h", "r",      "7",
    "i", "s",      "8",
    "j", "t",      "9",
]

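for _ in 0...10000 {
    var word = ""
    // Build a random word of 4 to 65 symbols.
    for _ in 0...Int.random(in: 3..<65) {
        word += symbols.randomElement()!
    }
    // Only a crash matters here, so the decoded result is discarded.
    _ = word.punycodeDecoded
}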

print("test done")

report before patch:

'--'                → '-'             | ERROR
'--9sb'             → 'б-'            | OK
'--btb'             → '-б'            | OK
'-'                 → ''              | OK
'-a-b-c-d-vbhklm5q' → 'ёпрст-a-b-c-d' | OK
'-y'                → ''              | OK
''                  → ''              | OK
'80abnmycp7evc'     → 'обращения'     | OK
'80acde'            → 'абвг'          | OK
'90a'               → 'б'             | OK
'a-b-c-d--3bhklm5q' → 'a-b-c-d-ёпрст' | OK
'a-b-c-d-lng'       → 'ёa-b-c-d'      | OK
'a-b-c-d-mng'       → 'aё-b-c-d'      | OK
'a-b-c-d-nng'       → 'a-ёb-c-d'      | OK
'a-b-c-d-ong'       → 'a-bё-c-d'      | OK
'a-b-c-d-png'       → 'a-b-ёc-d'      | OK
'a-b-c-d-qng'       → 'a-b-cё-d'      | OK
'a-b-c-d-rng'       → 'a-b-c-ёd'      | OK
'a-b-c-d-sng'       → 'a-b-c-dё'      | OK
'a-b-c-d'           → 'a-b€-c' | ERROR
'abcd-u8d'          → 'ёabcd'         | OK
'abcd-v8d'          → 'aёbcd'         | OK
'abcd-w8d'          → 'abёcd'         | OK
'abcd-x8d'          → 'abcёd'         | OK
'abcd-y8d'          → 'abcdё'         | OK
'abcd'              → 'ƒ‚€' | ERROR
'bcher-kva'         → 'bücher'        | OK
'r1aadaaghijkl'     → 'ттуууфхцчшщ'   | OK
'y-'                → 'y'             | ERROR
'y-z'               → 'yŒ'     | ERROR
'y'                 → '˜'      | ERROR
test done
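
The stray characters in the report above appear to be C1 control code points (U+0080 to U+009F), displayed here as their Windows-1252 glyphs. Below is a rough sketch of the arithmetic for the inputs "y" and "y-z". It assumes the RFC 3492 parameters (initial n = 128, digit values 0...25 for "a"..."z") and assumes that the decoder accepts a variable-length integer that is cut off by the end of the input; that second assumption is inferred from the output, not confirmed from the library source. The helper name `digitValue` is hypothetical.

```swift
// RFC 3492 digit value for a lowercase letter: "a"..."z" map to 0...25.
// `digitValue` is a hypothetical helper for illustration only.
func digitValue(_ letter: Unicode.Scalar) -> UInt32 {
    return letter.value - ("a" as Unicode.Scalar).value
}

let initialN: UInt32 = 128  // RFC 3492 initial_n

// Input "y": no basic code points, so the output length before insertion is 0.
// The delta is 24, and n advances by delta / (outputLength + 1) = 24 / 1.
let fromY = initialN + digitValue("y") / 1    // 152 = U+0098

// Input "y-z": one basic code point ("y"), so the output length is 1.
// The delta is 25, and n advances by 25 / 2 = 12 (integer division).
let fromYZ = initialN + digitValue("z") / 2   // 140 = U+008C

for value in [fromY, fromYZ] {
    let character = Character(Unicode.Scalar(value)!)
    // Neither code point is a letter or a number, so the proposed guard rejects it.
    print(String(value, radix: 16), character.isLetter, character.isNumber)
}
```

Under these assumptions the results are U+0098 and U+008C, which match the glyphs shown for 'y' and 'y-z' and which the proposed guard turns into `nil`.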

report after patch:

'--'                → '-'             | ERROR
'--9sb'             → 'б-'            | OK
'--btb'             → '-б'            | OK
'-'                 → ''              | OK
'-a-b-c-d-vbhklm5q' → 'ёпрст-a-b-c-d' | OK
'-y'                → ''              | OK
''                  → ''              | OK
'80abnmycp7evc'     → 'обращения'     | OK
'80acde'            → 'абвг'          | OK
'90a'               → 'б'             | OK
'a-b-c-d--3bhklm5q' → 'a-b-c-d-ёпрст' | OK
'a-b-c-d-lng'       → 'ёa-b-c-d'      | OK
'a-b-c-d-mng'       → 'aё-b-c-d'      | OK
'a-b-c-d-nng'       → 'a-ёb-c-d'      | OK
'a-b-c-d-ong'       → 'a-bё-c-d'      | OK
'a-b-c-d-png'       → 'a-b-ёc-d'      | OK
'a-b-c-d-qng'       → 'a-b-cё-d'      | OK
'a-b-c-d-rng'       → 'a-b-c-ёd'      | OK
'a-b-c-d-sng'       → 'a-b-c-dё'      | OK
'a-b-c-d'           → ''              | OK
'abcd-u8d'          → 'ёabcd'         | OK
'abcd-v8d'          → 'aёbcd'         | OK
'abcd-w8d'          → 'abёcd'         | OK
'abcd-x8d'          → 'abcёd'         | OK
'abcd-y8d'          → 'abcdё'         | OK
'abcd'              → ''              | ERROR
'bcher-kva'         → 'bücher'        | OK
'r1aadaaghijkl'     → 'ттуууфхцчшщ'   | OK
'y-'                → 'y'             | ERROR
'y-z'               → ''              | OK
'y'                 → ''              | OK
test done

P.S. As the author, I have no objection to this test being added to the source code.

maxrys, Sep 07 '24 10:09