async-validator icon indicating copy to clipboard operation
async-validator copied to clipboard

`type: url` RegExp validation is quadratic

Open PaulJuliusMartinez opened this issue 3 years ago • 1 comments

The regex used for URL validation shows quadratic behavior with unusual, but not necessarily pathological, inputs:

let re = new RegExp(
  '^(?!mailto:)(?:(?:http|https|ftp)://|//)(?:\\S+(?::\\S*)?@)?(?:(?:(?:[1-9]\\d?|1\\d\\d|2[01]\\d|22[0-3])(?:\\.(?:1?\\d{1,2}|2[0-4]\\d|25[0-5])){2}(?:\\.(?:[0-9]\\d?|1\\d\\d|2[0-4]\\d|25[0-4]))|(?:(?:[a-z\\u00a1-\\uffff0-9]+-*)*[a-z\\u00a1-\\uffff0-9]+)(?:\\.(?:[a-z\\u00a1-\\uffff0-9]+-*)*[a-z\\u00a1-\\uffff0-9]+)*(?:\\.(?:[a-z\\u00a1-\\uffff]{2,})))|localhost)(?::\\d{2,5})?(?:(/|\\?|#)[^\\s]*)?$',
  'i',
)

let maybeUrl = "https://"
for (let i = 0; i < 32; i++) {
  maybeUrl = maybeUrl + 'a'
  let start = new Date().getTime()
  maybeUrl.match(re)
  let end = new Date().getTime()
  console.log(maybeUrl.padStart(40, " "), String(end - start).padStart(10, " "))
}

Outputs:

                                     ...
               https://aaaaaaaaaaaaaaaaa          1
              https://aaaaaaaaaaaaaaaaaa          3
             https://aaaaaaaaaaaaaaaaaaa          7
            https://aaaaaaaaaaaaaaaaaaaa         12
           https://aaaaaaaaaaaaaaaaaaaaa         25
          https://aaaaaaaaaaaaaaaaaaaaaa         51
         https://aaaaaaaaaaaaaaaaaaaaaaa         98
        https://aaaaaaaaaaaaaaaaaaaaaaaa        191
       https://aaaaaaaaaaaaaaaaaaaaaaaaa        385
      https://aaaaaaaaaaaaaaaaaaaaaaaaaa        784
     https://aaaaaaaaaaaaaaaaaaaaaaaaaaa       1549
    https://aaaaaaaaaaaaaaaaaaaaaaaaaaaa       3138
   https://aaaaaaaaaaaaaaaaaaaaaaaaaaaaa       6285
  https://aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa      12427
 https://aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa      24510

Expected output:

The time to match the URL regular expression should be linear in the size of the input. I am unsure whether this can be resolved by rewriting the regular expression differently.

PaulJuliusMartinez avatar Jan 20 '22 00:01 PaulJuliusMartinez

An alternative approach would be to call new URL(input) and catch any errors, rather than writing and maintaining a custom regular expression.

PaulJuliusMartinez avatar Jan 24 '22 22:01 PaulJuliusMartinez