playwright [Question] How to handle subsequent GET requests returning server state during HAR replay?

Hello there! I have a question about HAR replay and subsequent (matching) GET requests.

Some details:

Context:

Playwright Version: 1.27.1
Operating System: macOS 12.5.1
Node.js version: 16.15.0
Browser: All
Extra: -

Code Snippet

Using npx playwright codegen, I captured a test scenario plus a matching HAR file.
When replaying, I noticed that repeated identical GET requests always return only the first match from the HAR file.
This is wrong if a UI interaction modifies the server state in between.
Example:

import path from "path";
import { test, expect } from "@playwright/test";

test("test", async ({ page }) => {
	await page.routeFromHAR(
		path.resolve(process.cwd(), "tests", "scenario.har"),
		{
			url: "https://api.local.dev/**",
		}
	);

	// When loading `/`, some client side javascript fetches from the api.local.dev
	// First GET returns initial value `{text: "Hello"}`
	await page.goto("/");
	await expect(page.locator(".message")).toHaveText("Hello");

	// Some POST requests modifying server state, setting `text = "World"`
	await page.getByRole("button", { name: "Submit" }).click();

	// Second GET still returns initial value `{text: "Hello"}`
	// I'd expect `{text: "World"}` though
	await page.goto("/");
	await expect(page.locator(".message")).toHaveText("World"); // ❌ This fails
});

The current behavior is in line with the documentation (matching headers etc., see https://playwright.dev/docs/network#replaying-from-har), but I find it hard to model a real application scenario this way.
I believe the issue (?) lies within the way playwright decides which request to match here: https://github.com/microsoft/playwright/blob/4ed2a01d9c4a372b7aee7931d32003261cb0351b/packages/playwright-core/src/server/dispatchers/localUtilsDispatcher.ts#L282-L291
How do I get playwright to return the "next" GET response from the HAR file? I'm not sure if it would be within the semantics of HAR replay to only ever return a HAR log once (hence removing it once returned)?
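
To make the behavior concrete, here is a simplified sketch of header-count matching with a first-wins tie-break. It is my own illustration, not Playwright's actual source: the `HarEntry` shape, the `pickEntry` helper, and the `/message` path are assumptions made up for the example.

```typescript
// Simplified illustration only; this is not Playwright's actual source.
type Header = { name: string; value: string };
type HarEntry = { url: string; method: string; headers: Header[]; body: string };

// Count how many recorded headers appear with the same value on the live request.
function countMatchingHeaders(recorded: Header[], live: Header[]): number {
  const liveSet = new Set(live.map((h) => `${h.name.toLowerCase()}:${h.value}`));
  return recorded.filter((h) => liveSet.has(`${h.name.toLowerCase()}:${h.value}`)).length;
}

// Pick the candidate with the highest score; on a tie, the earliest entry wins.
function pickEntry(entries: HarEntry[], url: string, method: string, live: Header[]): HarEntry | undefined {
  let best: HarEntry | undefined;
  let bestScore = -1;
  for (const entry of entries) {
    if (entry.url !== url || entry.method !== method) continue;
    const score = countMatchingHeaders(entry.headers, live);
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best;
}

// Hypothetical recording: the same GET captured before and after the POST.
const accept: Header = { name: 'accept', value: 'application/json' };
const recorded: HarEntry[] = [
  { url: 'https://api.local.dev/message', method: 'GET', headers: [accept], body: '{"text":"Hello"}' },
  { url: 'https://api.local.dev/message', method: 'GET', headers: [accept], body: '{"text":"World"}' },
];

// Both entries score the same, so every replayed GET resolves to the first one.
const served = pickEntry(recorded, 'https://api.local.dev/message', 'GET', [accept]);
```

With identical URL, method, and headers, nothing distinguishes the two recorded entries, so the second response is never reachable.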

Thank you for any pointers!

Oct 24 '22 11:10 leomelzer

Hi @leomelzer! I was running into the same problem, and found this issue.

I've figured out a way to make this work - although I'd love to see the Playwright team come up with a recommended approach.

Rather than having a single HAR file for the entire test, I create multiple HARs by creating "checkpoints". Here's a simplified version of my solution:


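// Assumes `page` is in scope (e.g. inside a test body).
async function harCheckpoint(id) {
  // Any non-empty RECORD env var switches from replay to record mode.
  const isRecordMode = Boolean(process.env.RECORD)
  await page.routeFromHAR(`my-test-checkpoint-${id}.zip`, { update: isRecordMode })
}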


await harCheckpoint('test-start')

// Run your test actions, assertions as usual...

// ...and then create a new "checkpoint" right before taking the action that will mutate state
await harCheckpoint('before-first-mutation')

// Take the action that mutates state
await page.getByRole('button', { name: "Button that mutates server state" }).click()

// Any requests served from here on out should have the new state since we're no longer using the initial HAR file.

Hopefully this conveys the idea.

Nov 22 '22 18:11 olivierbeaulieu

Thanks for sharing @olivierbeaulieu! I think this looks like a good work-around.

Skimming the HAR spec it looks like we can actually rely on the (sorted) entries, hence the Playwright code could be made more intelligent to dismiss previously hit / visited entries (instead of just taking the first one).

But that really feels like a breaking change also :) Happy about more input!

Nov 22 '22 19:11 leomelzer

I'm having the same issue although, in my experience, it looks like I always get the last result from the request that is done twice.

I've tried this checkpoint idea but it does not work. If I compare the .har contents, they point to the same files that are written next to them. I've also tried with a .zip and the same thing happens. I've checked which code is run internally, and it looks like there is no actual "split" between the routeFromHAR calls; it just adds more files to write to.

I agree @leomelzer that implementing that sorting/order would be great. Maybe add it as an option so it does not break any previously written tests for users?

Jan 17 '23 13:01 drieslamberechts

Update: I've found a way to simplify the checkpoint approach to avoid having multiple HAR archives. Makes it more reliable than the initial solution I posted previously.

Since Playwright decides which response to serve by counting the number of matching headers, you can simply tip the scale in the direction you want by intercepting requests at record time, adding headers that capture the context of your app. In my case, adding the current pathname + checkpoint ID was enough to make Playwright choose the right response from a set of calls to the same endpoint.

Create an automatic fixture:

harCheckpoint: [
    async function ({ context, page }, use) {
      let checkpointIndex = 0

      // You may not need this. Depends what you want to record/replay on.
      const allExceptLocal = /^(?!https?:\/\/localhost:3000).*$/

      await context.route(allExceptLocal, (route, request) => {
        const headers = {
          ...request.headers(),
          'X-Playwright-Checkpoint': `${checkpointIndex}`,
          'X-Playwright-Pathname': new URL(page.url()).pathname,
        }
        route.fallback({ headers })
      })

      await use(async () => {
        checkpointIndex += 1
      })
    },
    { auto: true, scope: 'test' },
  ],

And usage is something like:

// `myFixture` is the `test` object extended with the fixture above.
myFixture('my test', async ({ page, harCheckpoint }) => {
  // Run your test actions, assertions as usual...

  // ...and then create a new "checkpoint" right before taking the action that will mutate state
  await harCheckpoint()

  // Take the action that mutates state
  await page.getByRole('button', { name: "Button that mutates server state" }).click()
})

Much simpler imo.

Feb 14 '23 23:02 olivierbeaulieu

Very clever approach @olivierbeaulieu -- nice.

I wonder how much trouble it would be to formalize the same approximate "sequencing" logic into the actual PW feature code.

I had previously considered that it might be possible to solve this issue, using some sort of timestamp-based logic, but looking at your implementation, I bet it could be much simpler.

Figured I'd share a report from my app, in case it's useful.

Case Report

Overview of Test

testing an "add address" workflow, powered by an addAddress mutation
test case: it('errors when trying to add duplicate address')
addAddress round 1 = success
addAddress round 2 = error, duplicate address
assert error message shows

Problem: When running this test in replay mode, PW replays the first mutation's HAR entry for both transactions, so the error is never triggered.

Workaround: Came up with a registerSecondaryHarRecoding utility similar to others in this thread.

Related Consideration: I think this highlights a shortcoming in the API design around "idempotency", which is sometimes addressed using "nonces".

Mar 10 '23 00:03 leggomuhgreggo

Hi. I face the same problem and the proposed solution sounds great, but I am not sure how to use it. I see that some code has been merged, but I cannot find any related documentation.

Mar 04 '24 10:03 phramusca

With this code, the headers are added to my HAR file (with update=true), but when I replay the test with update=false, the headers are added to all requests EXCEPT the ones I want them on (the ones matching the HAR_URL regex) :(

(I used /./ instead of HAR_URL in harCheckpoint just for logging)

Any idea what could be wrong? @olivierbeaulieu please :)

import { HAR_URL } from '../../playwright.config';
import { test as base } from 'playwright-bdd';

export const test = base.extend<{ harCheckpoint: () => Promise<void> }>({
    harCheckpoint: [
        async function ({ context, page }, use) {
            let checkpointIndex = 0;

            console.log("Checkpoint: " + page.url());

            await context.route(/./, (route, request) => {
            // await context.route(HAR_URL, (route, request) => {
                const headers = {
                    ...request.headers(),
                    'My-Playwright-Checkpoint': `${checkpointIndex}`,
                    'My-Playwright-Page': page.url(),
                    'My-Playwright-Request': request.url()
                };
                console.log("Adding headers for request "+request.url()+" with checkpointIndex = " + checkpointIndex);
                route.fallback({ headers });
            });

            async function harCheckpoint() {
                console.log("harCheckpoint: " + page.url() + "; checkpointIndex=" + checkpointIndex + "+1");
                checkpointIndex += 1;
            }
            await use(harCheckpoint);
        },
        { auto: true, scope: 'test' },
    ],
});
export { expect } from '@playwright/test';

Here is my method to routeFromHar, that I call at test start:

import path from 'path';
import { HAR_URL, HAR_UPDATE } from '../../playwright.config';

const commonHarDirectory = './playwright/data/har/';

export async function routeFromHar(page: { routeFromHAR: (arg0: string, arg1: any) => any; }, relativeHarPath: string, options: any = {}) {

  const harPath = path.join(commonHarDirectory, relativeHarPath);

  const defaultOptions = {
    url: HAR_URL,
    update: HAR_UPDATE,
    updateContent: "embed",
    updateMode: "full",
    notFound: 'abort',
  };

  const mergedOptions = Object.assign({}, defaultOptions, options);
  console.log('HAR_UPDATE:', HAR_UPDATE);
  console.log('harPath:', harPath);
  await page.routeFromHAR(harPath, mergedOptions);
}

Mar 04 '24 17:03 phramusca

Well, in the end I removed the fixture and now simply call routeFromHar twice: once at the beginning of the test, and again right after the step that makes the calls differ.

Mar 05 '24 13:03 phramusca

Maybe it would make sense to "consume" the first matching entry, so that the next request gets the second, then the third, and so on. This seems to work for our case as a POC, but it could have other consequences though... (?)

We're trying to mock a payment API which returns a PENDING state for the same endpoint some number of times, but eventually becomes COMPLETED. Not sure how to use a .har mock for this.
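
A rough sketch of what "consuming" entries could look like, written as a standalone lookup rather than as a change to Playwright. Everything here is an assumption for illustration: the `RecordedResponse` shape, the `createSequentialReplay` helper, and the `https://pay.example.test/status` URL are made up, and repeating the last entry once everything is consumed is just one possible policy.

```typescript
type RecordedResponse = { url: string; method: string; body: string };

// Returns a lookup that hands out matching entries in recorded order.
// Once every match has been used, it keeps returning the last one.
function createSequentialReplay(entries: RecordedResponse[]) {
  const consumed = new Set<number>();
  return (url: string, method: string): RecordedResponse | undefined => {
    let last: RecordedResponse | undefined;
    for (let i = 0; i < entries.length; i++) {
      const entry = entries[i];
      if (entry.url !== url || entry.method !== method) continue;
      last = entry;
      if (!consumed.has(i)) {
        consumed.add(i);
        return entry;
      }
    }
    return last; // all matches consumed: repeat the final recorded response
  };
}

// Hypothetical recording of a polled payment status endpoint.
const statusUrl = 'https://pay.example.test/status';
const recordedStatus: RecordedResponse[] = [
  { url: statusUrl, method: 'GET', body: '{"state":"PENDING"}' },
  { url: statusUrl, method: 'GET', body: '{"state":"PENDING"}' },
  { url: statusUrl, method: 'GET', body: '{"state":"COMPLETED"}' },
];

const nextResponse = createSequentialReplay(recordedStatus);
// Four polls: the first three walk the recording, the fourth repeats the last entry.
const seen = [1, 2, 3, 4].map(() => nextResponse(statusUrl, 'GET')?.body);
```

The obvious catch is the one raised earlier in the thread: replay can issue fewer or differently ordered requests than the recording, so a purely positional match can drift.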

Apr 18 '24 10:04 SimonSikstrom

So I came up with another approach, based on this checkpoint idea but maybe a lot simpler in many cases. I intercept all API requests and inject a header with a sequence number when recording. This guarantees that you'll get the correct response for each request, in the order in which they occur:

let playwrightSequence = 1;
// Slashes inside a regex literal must be escaped.
await page.route(/^http:\/\/myapiurl\/api\/.*$/, async (route, request) => {
    const headers = {
        ...request.headers(),
        'X-Playwright-Sequence': playwrightSequence.toString(),
    };
    playwrightSequence++;
    await route.continue({ headers });
});

await page.routeFromHAR('./har/happy.har', {
    url: '*/**/api/**',
    update: true, // true to record; set to false to play back the recording
});

Limitations: if you use a lot of 'forkJoin()' to make parallel requests, then this might not be reliable and the checkpoint idea would be the way to go.

Aug 01 '24 02:08 clivepaterson

@clivepaterson I like your suggestion, but it didn't work for me because a) page.routeFromHAR needs to come before page.route, otherwise page.route is overwritten, and b) route.continue prevents other route handlers from being called. Additionally, the sequence number didn't always match when replaying, because the replay is a lot faster and so some "waterfall" requests aren't sent during the replay (e.g. because the page doesn't need to load fully).

Anyway, here's an adapted version.

// Note: this needs to come BEFORE the middleware below
await page.routeFromHAR('./har/happy.har', {
    url: '**/api/**',
    update: true, // or false if not recording
});

// The following route acts as a middleware that adds a sequence number to each request's headers.
// This way subsequent requests to the same endpoint with the same parameters can be distinguished.
const requestUrlToSequence = new Map();
await page.route('**/api/**', async (route, request) => {
  const url = request.url();

  // Each request gets a sequence number starting with 1. If a URL is hit multiple times, the sequence
  // number is increased. This allows Playwright to match subsequent requests in the HAR files.
  const previousRequestIndex = requestUrlToSequence.get(url) ?? 0;
  const currentRequestIndex = previousRequestIndex + 1;
  const headers = {
    ...request.headers(),
    'X-Playwright-Sequence': `${currentRequestIndex}`,
  };
  requestUrlToSequence.set(url, currentRequestIndex);

  // Update the request headers and continue the request chain.
  await route.fallback({ headers });
});

If you add this as a fixture you can call it instead of page.routeFromHAR.

Update: In our case, some requests still weren't matched correctly. Turns out Playwright selects the request by counting matching headers.

What helped was adding more headers like this:

const headers = {
    ...request.headers(),
    'X-Playwright-Sequence': `${currentRequestIndex}`,
    'X-Playwright-Sequence-2': `${currentRequestIndex}`,
    'X-Playwright-Sequence-3': `${currentRequestIndex}`,
    'X-Playwright-Sequence-4': `${currentRequestIndex}`,
  };
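
If you don't want to hard-code the repeated headers, a small helper can generate them. This is only a sketch: `sequenceHeaders` and its `weight` parameter are names I made up here, and the right number of copies depends on how many other headers differ between your recorded requests.

```typescript
// Builds `weight` headers that all carry the same sequence number:
// X-Playwright-Sequence, X-Playwright-Sequence-2, X-Playwright-Sequence-3, ...
function sequenceHeaders(sequence: number, weight = 4): Record<string, string> {
  const headers: Record<string, string> = {};
  for (let i = 1; i <= weight; i++) {
    const name = i === 1 ? 'X-Playwright-Sequence' : `X-Playwright-Sequence-${i}`;
    headers[name] = `${sequence}`;
  }
  return headers;
}

// Example: the headers for the second request to a URL, with the default weight.
const exampleHeaders = sequenceHeaders(2);
```

In the route handler above it would be spread in as `{ ...request.headers(), ...sequenceHeaders(currentRequestIndex) }`.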

Aug 26 '24 11:08 jkettmann

Just as an alternative: I've made a package that lets you mock and replay network requests without using HAR. It stores the cache on the filesystem and gives you full control over request matching. Here is an example.

Oct 08 '24 07:10 vitalets

Thanks for all the insights and sharing your solutions/workarounds! Think that should be helpful for people who end up here after googling.

@vitalets playwright-network-cache looks great! I'll try integrating it the next time I have this challenge. Thank you!

Oct 09 '24 11:10 leomelzer