[BUG] Playwright reports status code 304 as 200

Open JannisBush opened this issue 2 years ago • 2 comments

System info

  • Playwright Version: [v1.30]
  • Operating System: [All]
  • Browser: [Chromium, WebKit?]
  • Other info:

Source code

  • [x] I provided exact source code that allows reproducing the issue locally.

Test file (self-contained)

from playwright.sync_api import sync_playwright

URL = "https://subscribe.manoramaonline.com/content/subscription/subscriptionorderdetails.subscription.TR.digital.html"

def response_handler(response):
    # Only log the response for the main document, not its subresources.
    if response.request.url == URL:
        print(response)
        print(response.status)
        print(response.headers_array())

def main():
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False) # 200, 200 -> incorrect
        #browser = p.firefox.launch(headless=False) # 200, 304 -> correct
        #browser = p.webkit.launch(headless=False)  # 200, stuck after opening devtools?; without opening devtools 2x 200

        page = browser.new_page()
        page.on('response', response_handler)
        # Reload the same URL each time Enter is pressed; stop with Ctrl+C.
        while True:
            page.goto(URL)
            input("Open the network tab in devtools, then press enter.")

if __name__ == "__main__":
    main()

Steps

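  • Run the test; the page loads once and the script waits for input
  • Open the devtools in the browser and switch to the network tab
  • Press Enter in the terminal running the script to trigger the second navigation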

Expected

  • The first response should have status code 200, the second one should have status code 304.

Actual

  • Playwright (Chromium) reports status code 200 twice.
  • Devtools shows 304 for the second response.

Output:

❯ poetry run python test_304_playwright.py
<Response url='https://subscribe.manoramaonline.com/content/subscription/subscriptionorderdetails.subscription.TR.digital.html' request=<Request url='https://subscribe.manoramaonline.com/content/subscription/subscriptionorderdetails.subscription.TR.digital.html' method='GET'>>
200
[{'name': 'accept-ranges', 'value': 'bytes'}, {'name': 'cache-control', 'value': 'max-age=0, no-cache'}, {'name': 'content-encoding', 'value': 'gzip'}, {'name': 'content-length', 'value': '3266'}, {'name': 'content-type', 'value': 'text/html'}, {'name': 'date', 'value': 'Thu, 12 Oct 2023 14:18:43 GMT'}, {'name': 'etag', 'value': 'W/"30aa-604449427945c"'}, {'name': 'expires', 'value': 'Thu, 12 Oct 2023 14:18:43 GMT'}, {'name': 'last-modified', 'value': 'Fri, 01 Sep 2023 04:27:30 GMT'}, {'name': 'pragma', 'value': 'no-cache'}, {'name': 'server', 'value': 'Apache/2.4.27 (Amazon) Communique/4.2.1'}, {'name': 'vary', 'value': 'Accept-Encoding'}, {'name': 'x-frame-options', 'value': 'SAMEORIGIN'}]
Open the network tab in devtools, then press enter.
<Response url='https://subscribe.manoramaonline.com/content/subscription/subscriptionorderdetails.subscription.TR.digital.html' request=<Request url='https://subscribe.manoramaonline.com/content/subscription/subscriptionorderdetails.subscription.TR.digital.html' method='GET'>>
200
[{'name': 'cache-control', 'value': 'max-age=0, no-cache'}, {'name': 'content-type', 'value': 'text/html'}, {'name': 'date', 'value': 'Thu, 12 Oct 2023 14:18:53 GMT'}, {'name': 'etag', 'value': 'W/"30aa-604449427945c"'}, {'name': 'expires', 'value': 'Thu, 12 Oct 2023 14:18:53 GMT'}, {'name': 'last-modified', 'value': 'Fri, 01 Sep 2023 04:27:30 GMT'}, {'name': 'pragma', 'value': 'no-cache'}]
Open the network tab in devtools, then press enter.
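
One way to read the two dumps above is to diff the header names. The second response lacks several headers that the first one carries, which would be consistent with the second dump being the headers of a 304 revalidation response reported under status 200, though that is only a guess from the output. A minimal sketch using the names copied from the output (`header_names` is a made-up helper, not a Playwright API):

```python
def header_names(headers_array):
    """Return the lower-cased header names from a headers_array()-style list."""
    return {h["name"].lower() for h in headers_array}

# Header names copied from the two dumps above; values are omitted for brevity.
first = [{"name": n, "value": ""} for n in (
    "accept-ranges", "cache-control", "content-encoding", "content-length",
    "content-type", "date", "etag", "expires", "last-modified", "pragma",
    "server", "vary", "x-frame-options",
)]
second = [{"name": n, "value": ""} for n in (
    "cache-control", "content-type", "date", "etag", "expires",
    "last-modified", "pragma",
)]

# Headers present on the first navigation but absent on the second.
missing_in_second = sorted(header_names(first) - header_names(second))
print(missing_in_second)
```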

Screenshot: Screenshot 2023-10-12 at 16 06 36

JannisBush, Oct 12 '23 14:10

I can repro. Investigation notes:

  • It seems that without Playwright the browser sends the Cache-Control: max-age=0 request header on the second request, while with Playwright that header is not sent.
  • Removing custom chromium switches does not fix it.
  • Disabling most protocol commands like emulation, utility scripts, etc. does not fix it.
  • Launching persistent context does not fix it.
  • Overall, I was not able to find a meaningful difference between Playwright's CDP traffic and DevTools' CDP traffic that would affect this, but repro is 100% reliable.
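
The first note can also be checked from a script by logging request headers in a `request` listener. A minimal sketch, assuming the object passed to the handler exposes `url` and a `headers` dict with lower-cased names, as Playwright's `Request` does. `sends_max_age_zero` and `log_cache_control` are hypothetical helper names. `headers` may omit some headers, so a negative result is not conclusive; `request.all_headers()` is the fuller alternative.

```python
from types import SimpleNamespace

def sends_max_age_zero(headers):
    """True if a request-header dict carries a Cache-Control: max-age=0 directive."""
    value = headers.get("cache-control", "")
    directives = [d.strip().lower() for d in value.split(",")]
    return "max-age=0" in directives

def log_cache_control(request, target_url):
    """Listener body: report whether the target request asks for revalidation."""
    if request.url != target_url:
        return None
    result = sends_max_age_zero(request.headers)
    print(f"{request.url}: max-age=0 sent = {result}")
    return result

# Stand-in object so the sketch runs without a browser. With Playwright this
# would be wired up as: page.on("request", lambda r: log_cache_control(r, URL))
fake_request = SimpleNamespace(
    url="https://example.test/page",  # placeholder URL, not from the report
    headers={"cache-control": "max-age=0"},
)
log_cache_control(fake_request, "https://example.test/page")
```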

dgozman, Oct 12 '23 20:10

👀 Also ran into this: we had a bunch of waitForResponse calls using resp.ok(), and they were failing in Firefox only, evidently because Firefox exposes the 304 status rather than the cached 200 status.

This issue appears to be filed in the context of wanting the 304 status to be exposed everywhere, but I would argue that really it should expose the cached status, as that is what the browser will be seeing in essence (and would fix the rather big gotcha of resp.ok() not working as one might expect it to).
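
Until the reported status is consistent across browsers, a predicate that accepts both outcomes can stand in for resp.ok(). A minimal sketch in Python, assuming the response object exposes an integer `status` as Playwright's `Response` does; `ok_or_not_modified` is a hypothetical helper, not part of Playwright:

```python
from types import SimpleNamespace

def ok_or_not_modified(response):
    """Treat any 2xx status and 304 Not Modified as success."""
    return 200 <= response.status <= 299 or response.status == 304

# Stand-in responses so the sketch runs without a browser.
for status in (200, 304, 404):
    print(status, ok_or_not_modified(SimpleNamespace(status=status)))
```

With the sync API this could be used as part of the predicate, for example `page.expect_response(lambda r: r.url == url and ok_or_not_modified(r))`, where `url` is whatever response the test is waiting for.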

MattIPv4, Mar 19 '24 14:03