[BUG] WebKit on Linux includes Content-Length: 0 on GET requests

Open jperl opened this issue 1 year ago • 3 comments

System info

  • Playwright Version: v1.32.0
  • Operating System: Ubuntu
  • Browser: WebKit
  • Other info: mcr.microsoft.com/playwright:v1.32.0-focal docker image

Source code

  • [x] I provided exact source code that allows reproducing the issue locally.
const { webkit } = require('playwright');

(async () => {
  const browser = await webkit.launch({ headless:false, devtools: true });
  const page = await browser.newPage();

  page.on('request', data => {
    data.allHeaders().then(headers => {
      console.log("content-length", headers['content-length']);
    });
  });

  await page.evaluate(() => {
    fetch("http://httpbin.org/get", {
      // this header will cause WebKit Linux to include a request header with Content-Length: 0
      "headers": { "Content-Type": "application/json" },
      "method": "GET"
    });
  });
})();

Steps

  • Run the script on Linux and Mac
  • Compare the logs
  • Observe that Linux logs 0 (unexpected) while Mac logs undefined (expected).
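
To make the log comparison in the steps above mechanical, the check can be pulled into a small predicate. This is only a sketch: `hasUnexpectedContentLength` and `BODILESS_METHODS` are hypothetical names, not part of Playwright, and treating only GET and HEAD as bodiless is an assumption made for illustration.

```javascript
// Hypothetical helper (not a Playwright API): flags a request whose method
// normally carries no body but which still has a Content-Length header.
const BODILESS_METHODS = new Set(['GET', 'HEAD']);

function hasUnexpectedContentLength(method, headers) {
  if (!BODILESS_METHODS.has(String(method).toUpperCase())) return false;
  // request.allHeaders() returns lower-cased names, but compare
  // case-insensitively so plain header objects work too.
  return Object.keys(headers).some(name => name.toLowerCase() === 'content-length');
}

// Intended use inside the reproduction script (needs Playwright, so it is
// left commented out here):
// page.on('request', async request => {
//   const headers = await request.allHeaders();
//   if (hasUnexpectedContentLength(request.method(), headers))
//     console.log('unexpected content-length:', request.method(), request.url());
// });
```

With this in place, a run on Mac should print nothing for the GET request, while a run on Linux WebKit should print one line per affected request.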

Expected

On Linux WebKit, the content-length header should not be included for GET requests -- the same way it works on Mac WebKit.

This would align with RFC 7230.

Screenshot 2023-04-21 at 5 43 21 PM

Actual

Only on Linux WebKit, a content-length: 0 request header is included. This causes requests to fail on websites that protect against HTTP Request Smuggling attacks.

For example, any AWS ALB that has "Desync mitigation mode" set to Strictest will reject requests.
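
A possible page-side stopgap, going only by the comment in the reproduction script that the Content-Type header is what triggers the extra header: drop Content-Type from requests that have no body. This is a minimal sketch, not a verified fix. `withoutBodilessContentType` is a hypothetical helper name, and it assumes `headers` is a plain object rather than a `Headers` instance.

```javascript
// Hypothetical helper: returns fetch() options with Content-Type removed
// when the request is a GET/HEAD without a body. Other options are kept.
function withoutBodilessContentType(options = {}) {
  const method = (options.method || 'GET').toUpperCase();
  const hasBody = options.body !== undefined && options.body !== null;
  if (hasBody || (method !== 'GET' && method !== 'HEAD')) return options;

  // Assumption: headers is a plain object, e.g. { "Content-Type": "..." }.
  const headers = {};
  for (const [name, value] of Object.entries(options.headers || {})) {
    if (name.toLowerCase() !== 'content-type') headers[name] = value;
  }
  return { ...options, headers };
}

// Example call site:
// fetch(url, withoutBodilessContentType({ method: 'GET', headers: { 'Content-Type': 'application/json' } }));
```

Requests that do carry a body are returned unchanged, so POST and PUT calls keep their Content-Type.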

Screenshot 2023-04-21 at 6 56 10 PM

jperl • Apr 22 '23 01:04

Looks like a libsoup2 bug to me:

// Compile using:
// gcc test.c -o test `pkg-config --cflags --libs libsoup-2.4`
#include <libsoup/soup.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    SoupSession *session;
    SoupMessage *msg;
    SoupURI *uri;
    const char *url;
    GError *error = NULL;

    url = "https://httpbin.org/headers";

    session = soup_session_new();
    msg = soup_message_new("GET", url);
    uri = soup_uri_new(url);

    soup_message_headers_append(msg->request_headers, "Content-Type", "kektus");

    soup_message_set_flags(msg, SOUP_MESSAGE_NO_REDIRECT);
    soup_message_set_uri(msg, uri);

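    /* soup_session_send_message() returns the status code and never fills in
       a GError, so report transport failures via the status phrase instead. */
    guint status = soup_session_send_message(session, msg);
    if (SOUP_STATUS_IS_TRANSPORT_ERROR(status)) {
        fprintf(stderr, "Error sending request: %s\n", soup_status_get_phrase(status));
        return 1;
    }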

    printf("Status code: %d\n", msg->status_code);
    printf("%s", msg->response_body->data);

    soup_uri_free(uri);

    return 0;
}

which outputs:

Status code: 200
{
  "headers": {
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "0", 
    "Content-Type": "kektus", 
    "Host": "httpbin.org", 
    "X-Amzn-Trace-Id": "Root=1-64440e2d-43565cee4f9bc7233912ae54"
  }
}

With libsoup3 it's fine:

// Compile with:
// cc test.c -o test `pkg-config --cflags --libs libsoup-3.0`
#include <libsoup/soup.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
        SoupSession * session;
        SoupMessage * msg;
        GUri * uri;
        const char *url;
        GError *error = NULL;

        GBytes * body;
        url = "https://httpbin.org/headers";

        session = soup_session_new();

        msg = soup_message_new("GET", url);
        soup_message_headers_append(soup_message_get_request_headers(msg), "Content-Type", "kektus");
        uri = g_uri_parse(url, SOUP_HTTP_URI_FLAGS, &error);
        if (error != NULL)
        {
                g_printerr("Could not parse '%s' as a URL: %s\n", url, error->message);
                return 1;
        }

        soup_message_set_flags(msg, SOUP_MESSAGE_NO_REDIRECT);
        soup_message_set_uri(msg, uri);
        body = soup_session_send_and_read(session, msg, NULL, &error);
        if (body == NULL)
        {
                g_printerr("Failed to contact HTTP server: %s\n", error->message);
                return 1;
        }

        char *text;
        text = g_strndup(g_bytes_get_data(body, NULL), g_bytes_get_size(body));
        g_printerr("%s\n", text);

        return 0;
}

mxschmitt • Apr 22 '23 16:04

We should evaluate whether it makes sense for us to move from soup2 to soup3. Upstream has been using soup3 exclusively for two years already. We could use libsoup2 on Ubuntu20 and libsoup3 on Ubuntu22.

mxschmitt • Apr 22 '23 16:04

Update: We decided to migrate to libsoup3 where it's possible. For Ubuntu22+Debian11 this would mean libsoup3. This issue will keep track of the progress of the migration.

mxschmitt • May 02 '23 13:05

Is the libsoup3 migration done in v1.37? https://github.com/microsoft/playwright/issues/23259 is still not working.

sylhero • Aug 16 '23 14:08