Support concurrent runs in Node.js

kettanaito opened this issue 3 years ago · 22 comments

What

I suggest that the setupServer (node-request-interceptor) API support execution in a concurrent mode (i.e. multiple test suites running at the same time).

Why

Concurrency is crucial in large projects, allowing for faster execution of tests.

Current behavior

Currently node-request-interceptor patches the native request-issuing modules per process, which allows irrelevant mocking rules to leak into other test suites, as they all run in a single process.

How

Look into how nock operates. It patches modules in a similar fashion, and I'd suspect it supports a concurrent mode. We can learn how they did it and evaluate whether we can adopt a similar solution.

kettanaito avatar Nov 24 '20 08:11 kettanaito

The concurrency problem should happen only with multiple instances of setupServer. Under the hood, when setupServer is called, http, https and XMLHttpRequest are patched to start intercepting requests. So if I use a single instance across my application, I won't encounter any problem.

If I use multiple instances of setupServer, it could be possible that a test restores the patched modules before other tests finish, causing the issue.

It's very strange that nock is not having that problem; their restore function is very simple and doesn't have any kind of check 🤔

marcosvega91 avatar Nov 26 '20 19:11 marcosvega91

Thank you for the insights, @marcosvega91.

I suppose we can dedupe module patching in node-request-interceptor, but that won't solve the issue of unrelated modules (i.e. test suites) being affected by the interceptors declared in other modules, as they still share the same process.

What we can investigate is how exactly modern test frameworks execute tests in parallel. For instance, do they provision forked processes? If so, perhaps we could bind request interception to the forked process instead of the parent process.

kettanaito avatar Nov 28 '20 10:11 kettanaito

I have tried to investigate a little bit today. It seems that nock avoids this problem by simply resetting modules between tests. It is strange because the patched native modules are shared between tests. I have also tried running Jest while printing memory usage, and it seems that we have a memory leak in node-request-interceptor.

[screenshot: Jest memory usage per test]

marcosvega91 avatar Dec 01 '20 18:12 marcosvega91

@marcosvega91, thanks for investigating that! I believe that the memory leaks were fixed in https://github.com/mswjs/node-request-interceptor/pull/77. Regarding the patched modules reset, it's indeed peculiar, as MSW resets those patched modules in a similar fashion to nock.

kettanaito avatar Dec 14 '20 14:12 kettanaito

@kettanaito @marcosvega91 It seems like we need to investigate more deeply why we have a memory leak in node-request-interceptor, even though we reset those patched modules in a similar fashion to nock.

When it comes to the code, where do you think I could start in order to investigate it? @marcosvega91 How can I reproduce this memory leak issue in our current repo with the tests we have?

Also, you mentioned we reset those patched modules in a similar fashion to nock; where in the code are we doing that? I am looking for some sort of direction.

Thanks, haha, I love working with you guys :smiley: :smiling_face_with_three_hearts: :hugs:

tigerabrodi avatar Dec 14 '20 19:12 tigerabrodi

@tigerabrodi, thanks for the interest in this task. I'll try to share some details below.

Restoring patches

Each interceptor in NRI is a function that, when called, patches a respective request-issuing module to intercept requests, and returns a function to restore that patch. Below you can find the restoration points for the two current interceptors: ClientRequest and XMLHttpRequest.
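
For illustration, here is a minimal sketch of that patch-and-restore pattern (the function names are hypothetical, not NRI's actual API):

const http = require('http')

function interceptClientRequest(handleRequest) {
  const pureRequest = http.request

  // Patch the module so every outgoing request is observed first.
  http.request = (...args) => {
    handleRequest(...args)
    return pureRequest(...args)
  }

  // The restore function puts the original implementation back.
  return () => {
    http.request = pureRequest
  }
}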

I don't think that the memory leak Marco mentioned is related to module restoration. Most likely it was fixed in https://github.com/mswjs/node-request-interceptor/pull/77, but it would still be nice to double-check if there are any other leaks.

ClientRequest

https://github.com/mswjs/node-request-interceptor/blob/62f2a406ea24fdc35e660aa6f888c0ff99d979d5/src/interceptors/ClientRequest/index.ts#L106-L115

XMLHttpRequest

https://github.com/mswjs/node-request-interceptor/blob/62f2a406ea24fdc35e660aa6f888c0ff99d979d5/src/interceptors/XMLHttpRequest/index.ts#L29-L35

The best way to investigate this issue is to reproduce it. Try to set up a few tests with Jest, let all of them use their own as well as shared request handlers, and run them concurrently (Jest does that by default). Once unrelated tests start respecting request handlers other than those declared in them, you've got the reproduction scenario.

Technically, this happens because module patching applies to the entire process, so when multiple tests run in a single process they all start respecting all the patches, leaking handlers into unrelated tests.
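
A minimal sketch of such a reproduction (file names, URLs, and the shared setup module are illustrative):

// a.test.js: relies only on the shared handlers
it('fetches the user', async () => {
  // Expects the shared 200 handler for GET /user to respond.
  await fetch('/user')
})

// b.test.js: prepends its own runtime handler
import { rest } from 'msw'
import { server } from './jest.setup'

it('handles a server error', async () => {
  server.use(rest.get('/user', (req, res, ctx) => res(ctx.status(500))))
  // If handlers leak across suites, a.test.js may receive this 500 too.
  await fetch('/user')
})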

kettanaito avatar Dec 15 '20 10:12 kettanaito

@kettanaito @marcosvega91 I do not know if I will ever get to this issue; I am personally fully occupied with the OSS Raid Group, aside from my side project + workshops at FrontendMasters.

tigerabrodi avatar Dec 15 '20 21:12 tigerabrodi

Hi there! That's my first time here in the MSW repo, so first of all, congrats to everyone involved; the project is fantastic. ⭐

I've started to learn more about the library's internals and came across this issue, so I'm trying to understand if I should run my Jest tests using the --runInBand config option in Jest.

Considering what I read here in this issue and my current understanding of Jest's parallelization strategy, I would expect it to be OK to run multiple Jest test suites in parallel.

In practical terms, I would expect it to be acceptable to use --maxWorkers=N where N > 1 instead of --runInBand if I create a single setupServer instance per test suite (node process), which I do.

Wrapping it up: if I create a single setupServer instance per Jest test suite (node process) via Jest's beforeAll/afterAll hooks, each process will patch the native request-issuing modules without affecting other parallel processes.
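
For reference, a minimal sketch of that per-suite setup (the handler URL is illustrative):

// jest.setup.js: one setupServer instance per worker process
import { setupServer } from 'msw/node'
import { rest } from 'msw'

export const server = setupServer(
  rest.get('https://api.example.com/user', (req, res, ctx) =>
    res(ctx.json({ name: 'octocat' }))
  )
)

beforeAll(() => server.listen())
afterEach(() => server.resetHandlers())
afterAll(() => server.close())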

I would appreciate it if you could help me to clarify this. Thanks!

fnmunhoz avatar Feb 06 '21 12:02 fnmunhoz

Hi, @fnmunhoz. Thank you for the feedback. I'll try to clarify things for you below.

Despite the parallelization in Jest, it still only spawns one process. The parallelization may happen via threads/child processes to achieve the execution of multiple test suites simultaneously. The request interception functionality in Node.js for MSW patches the request-issuing modules per process, so regardless of the parallelization it will affect unrelated tests, as they all run in a single process.

We could use some help researching how exactly Jest implements parallelization. It won't give us a solution, but it may hint at the right direction. I believe the solution lies in handling parallel runs in node-request-interceptor (the library MSW uses for Node.js request interception).

kettanaito avatar Feb 06 '21 16:02 kettanaito

Hey @kettanaito thanks for the update!

I'm trying to simulate this problem in Jest, but so far I haven't been able to make tests from different test suites affect each other.

I'll share what I have tried, as you might spot what I might be missing.

I've created this repo with two identical test suites, except for L9 where the endpoint changes, so we can identify it in the logs:

https://github.com/fnmunhoz/msw-concurrent-execution/commit/67f7cf07d3ee13470b3310717faf2a02d84cdc71

The Jest output for that is as in the screenshot:

[screenshot: Jest output for the two concurrent suites]

From its logs, I couldn't see any issues, since each test suite seems to respect its own HTTP handlers.

Do you think the way these test suites are defined should demonstrate the concurrency issue?

If not, do you have any suggestions on what I could try?

I'm hoping that since Jest executes each test suite in a separate process, it might not be an issue when we use MSW with Jest as the test runner?

fnmunhoz avatar Feb 07 '21 23:02 fnmunhoz

@kettanaito

Despite the parallelization in Jest, it still only spawns one process. The parallelization may happen via threads/child processes to achieve the execution of multiple test suites simultaneously. The request interception functionality in Node.js for MSW patches the request-issuing modules per process, so regardless of the parallelization it will affect unrelated tests, as they all run in a single process.

Does this mean that I should run tests in band (jest --runInBand) if I have different request handlers per test? I've been battling randomly failing tests in CI for a few months now, and I'm wondering if turning off parallelization would be the key here. I've recently refactored our setup to use a singleton for the MSW server, but this still causes the issue of data leaking across tests.

JCB-K avatar Mar 11 '21 15:03 JCB-K

@JCB-K at the moment we do recommend running them in band to prevent possible issues where request handlers affect unrelated tests. This issue is aimed at discussing and adding support for parallel test runs. That being said, it looks like not everybody experiences this issue, so there may be some setup difference at play.

kettanaito avatar Mar 13 '21 00:03 kettanaito

I'm experiencing this issue now. This is what I've found so far. As far as I can see, Jest does not run multiple tests at the same time by default. It needs to be explicitly set up by using the test.concurrent function.

It does, however, use a worker pool, where each test file is given to a worker. All tests within that suite will use the same worker process.

Therefore a global beforeAll() (e.g. in the jest.setup.js file) will be run for each worker. --runInBand will still spawn one worker, but reuse that worker for all the test files. Setting --max-workers=1 effectively does the same.

If you have more test files than workers, Jest will reuse workers (but still run beforeAll for each file).

Regardless of max-workers, Jest will not use more than the number of cores - 1 in single-run mode, and the number of cores / 2 in watch mode.

I can reproduce this issue quite frequently while running with --max-workers=1 or --runInBand, but not consistently.

A common denominator in my case is that I have three tests in a file: one happy path that uses the globally registered mock handlers, and two unhappy paths, one with a delayed response (to verify that the spinner is shown) and one with a network error (to verify the error message). Those two tests register their own handlers on the same endpoint/URL.

They seem to occasionally intersect with each other; that is, the spinner test gets the request with an error, and the error test gets the delayed request.
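
A sketch of that test file shape (the endpoint, delay, and setup path are hypothetical):

import { rest } from 'msw'
import { server } from './jest.setup'

it('renders the data (happy path)', async () => {
  // Relies on the globally registered handler for GET /api/data.
})

it('shows a spinner while loading', async () => {
  server.use(
    rest.get('/api/data', (req, res, ctx) => res(ctx.delay(2000), ctx.json([])))
  )
})

it('shows an error message on network failure', async () => {
  server.use(
    rest.get('/api/data', (req, res, ctx) => res.networkError('Connection refused'))
  )
})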

androa avatar Jun 11 '21 08:06 androa

I have found two things in my test suite that create the flaky behaviour. Simply clearing the cache like this does not help (all the time):

afterEach(async () => {
  cache.clear();
  server.resetHandlers();
});

When adding an artificial delay like this, it seems to consistently work:

afterEach(async () => {
  await new Promise((resolve) => setTimeout(resolve, 0));
  cache.clear();
  server.resetHandlers();
});

I suspect this might be connected to the other thing I've found, but I have not been able to prove it or create a consistent reproduction.

I test a component that uses SWR to load data, but in this case it can render instantly before the fetch is complete and update as the data comes in. I have no way to differentiate the loading status from an empty dataset, so the test passes immediately.

But the request (and the mock) is still in-flight and active, so the next test (which should have data) hits the previous mock.

The workaround was adding an invisible loader that I can wait for to disappear before the test passes.
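
With React Testing Library, that wait could look like this (the test ID is hypothetical):

import { screen, waitForElementToBeRemoved } from '@testing-library/react'

// Wait for the (invisible) loader to disappear before the test finishes,
// ensuring the mocked request has actually settled.
await waitForElementToBeRemoved(() => screen.queryByTestId('invisible-loader'))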

androa avatar Jul 11 '21 14:07 androa

@androa I suspect I'm also having a similar problem to you. I tried adding an artificial delay (without the cache clear) and it didn't work. I hadn't considered using an invisible loader, which is a good solution, so that the test can determine when the request has completed. In many of our tests, when the API call completes it doesn't result in a screen change. There are also many tests where an API call is made but the test doesn't need to wait for it to complete.

I want to spend some time to create a reproduction repo but haven’t got round to it. Hopefully I can find some time soon.

WayneEllery avatar Jul 11 '21 15:07 WayneEllery

Thank you @kettanaito for bringing this wonderful library to us. I'm using jest, react-testing-library and msw for our tests. In our newest projects, we are mocking about 30 requests using msw. If we execute the tests locally (macOS), they usually pass. But running on Travis, it's pretty common for some test cases like this to fail:

await waitFor(() =>
  // Appear when receive response
  expect(screen.getByTestId('some-id')).toBeInTheDocument()
);

On Travis, we run with the config npm test -- --coverage --maxWorkers=4. After reading this issue, I believe the reason is that msw currently supports single (serial) runs, and when it has to cope with Jest's concurrent runs (combined with the weaker hardware of CI), the responses take longer (sometimes more than 1000ms), while waitFor from react-testing-library has a default timeout of 1000ms (I know the current recommendation is to run in --runInBand mode), so the above scenario randomly fails. I just want to share this insight from dealing with those kinds of broken tests. I believe that when msw supports concurrent runs, the issue will disappear.

For folks dealing with tests like this that pass locally but fail on CI/CD (or when running the whole suite):

await waitFor(() =>
  // Appear when receive response
  expect(screen.getByTestId('some-id')).toBeInTheDocument()
);

The reason is likely that msw takes more than 1 second to respond (while waitFor has a default timeout of 1 second); just increasing the timeout will resolve the problem:

await waitFor(
  () => expect(screen.getByTestId('some-id')).toBeInTheDocument(),
  {
    timeout: 2000,
  }
);

nvh95 avatar Nov 05 '21 17:11 nvh95

I've had some time to think about this, sharing my thoughts below.

First of all, the issue is relevant only when using runtime handlers with parallel test runs. If you're using the same (global) set of handlers, you won't be affected: handlers are idempotent in nature. When using runtime handlers (server.use()), you modify the global list of handlers by prepending a runtime handler. Since all tests refer to the global handlers list, in parallel runs a handler prepended in one test may affect requests in an unrelated test (module).

I propose to solve this by ensuring the identity of requests and runtime handlers.

Request identity

Currently, each captured request has an id. That's okay. On top of that, I suggest adding something like moduleId/modulePath that represents the module which issued the request. This is only relevant in Node.js.

{
  "id": "uuid-of-request",
  "modulePath": "/User/octocat/GitHub/test/Profile.test.tsx"
}

Runtime handler identity

Similar to request identity, I propose to add moduleId/modulePath to runtime handlers only (those defined via server.use()). As long as the request and the runtime handler share the same moduleId, it's guaranteed that they were issued/prepended in the same module and can affect each other.

// test/index.test.ts
it('some test', () => {
  // Set the "moduleId" on this "GET /foo" handler.
  server.use(rest.get('/foo', resolver))

  // Add the "moduleId" to this captured request
  // (done internally in @mswjs/interceptors). 
  fetch('/foo')
})
  • The moduleId is not set in the browser.
  • The moduleId is derived from the module and is fixed.
  • The moduleId affects what handlers are considered relevant in the lookup phase. We can first check the moduleId equality (if set at all) before executing handler.predicate().

I find the modification of request/handler instances more acceptable than creating a per-module handlers list, because with the proposed change we can keep our internal handler lookup/resolution phase intact: the source of handlers always remains the same.
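
A sketch of what that lookup check could look like (illustrative only, not MSW's actual internals):

function lookupHandler(handlers, request) {
  return handlers.find((handler) => {
    // A runtime handler bound to a module only applies to requests
    // issued from that same module.
    if (handler.moduleId && handler.moduleId !== request.moduleId) {
      return false
    }
    return handler.predicate(request)
  })
}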

In theory, this should allow us to support parallel test runs. Curious about what others think.

kettanaito avatar Mar 18 '22 16:03 kettanaito

Even more peculiar, you can't have a beforeAll setup when using test.concurrent in Jest (see https://github.com/facebook/jest/issues/4281). That we can't really solve; it's an issue with Jest. So far I've been trying to write a reliable test suite where I can reproduce this issue, and failing to do so.

kettanaito avatar Mar 18 '22 16:03 kettanaito

Here's a hacky workaround with constraints that I've come up with to get msw to work in concurrent environments. It's not ideal, but it can be a stop-gap solution for situations where running tests in serial order is not an option.

  1. Define an empty server: const server = setupServer()

  2. No longer use server.resetHandlers() as it will erase the handlers of other tests.

    • Note: it would be nice if msw had a method to remove individual handlers from a server, and/or display a list of handlers currently active or archived within a server for debugging
    • Instead, use res.once() to create runtime handlers in your test cases that are disposed of after a one-time use. This is currently the only way to dispose of individual handlers without erasing other handlers.

    server.use(
      rest.get('someEndpoint/', async (req, res, ctx) => res.once(ctx.json(...)))
    );
    
    
  3. Colocate tests in one file per endpoint. All tests for that endpoint live in one file now.

    • Jest tests within one file run serially, even if the entire test suite is being run in a concurrent environment.
    • This will ensure that msw handlers do not conflict with one another for that endpoint.
    • Using res.once() ensures that the handler is only invoked from within that test, and becomes inactive immediately after (see the sketch after this list).
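
Putting the pieces together, a colocated per-endpoint test file could look like this (endpoint, file paths, and response bodies are hypothetical):

// user-endpoint.test.js: all tests for GET /user live here and run serially
import { rest } from 'msw'
import { server } from './test-server' // the shared, initially empty server

it('returns the user', async () => {
  server.use(
    rest.get('/user', async (req, res, ctx) => res.once(ctx.json({ name: 'octocat' })))
  )
  // ...assertions...
})

it('handles a server error', async () => {
  server.use(
    rest.get('/user', async (req, res, ctx) => res.once(ctx.status(500)))
  )
  // ...assertions...
})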

AlexeiDarmin avatar Aug 23 '22 19:08 AlexeiDarmin

Hey, @AlexeiDarmin. Thanks for the effort above!

it would be nice if msw had a method to remove individual handlers from a server, and/or display a list of handlers currently active or archived within a server for debugging

You can list the currently active handlers via server.printHandlers(). Alas, it prints the handlers summary to stdout instead of returning a list. If you are interested, it'd be awesome to have a change where printHandlers() returns a list of handlers so people could migrate to the new API like so:

-server.printHandlers()
+console.log(server.listHandlers())

An interesting point on utilizing res.once().

kettanaito avatar Aug 24 '22 11:08 kettanaito

Here's a failing concurrency test I got working for a single file https://github.com/mswjs/msw/pull/1370

I was not able to reproduce a multi-file failing test, where multiple files actually share one instance of a server with race conditions. With what I've learned about Jest workers, I'm starting to think this case may not be possible.

My current understanding of jest workers (thanks @androa for your earlier comment, it helped):

  • Each Jest worker is its own sandbox, with its own globals.
  • Jest workers pick up one file at a time
    • call beforeAll per file
    • call afterAll per file
    • call beforeEach per test within a file, but only after beforeAll
  • Once a Jest worker picks up a new file, it has no knowledge of anything before the current test, so it calls setupServer again, along with all the lifecycles listed above, on a fresh instance of msw.

I suspect the msw community is encountering two separate race conditions that are causing intermittent test failures from within single files.

  • Calling resetHandlers() in beforeEach, which incorrectly removes other tests' handlers.
  • While testA is awaiting something, a new handler is attached in testB, which will now take precedence and intercept the request from testA.

If this is true, then the moduleId proposal may be insufficient for differentiating concurrency within one file.

AlexeiDarmin avatar Aug 24 '22 23:08 AlexeiDarmin

Outstanding work, @AlexeiDarmin! 👏 Will provide my feedback in the pull request.

kettanaito avatar Aug 25 '22 11:08 kettanaito

What is the use case and requirement?

guest271314 avatar Oct 22 '22 21:10 guest271314

@guest271314, the use case is to ensure the isolated application of runtime handlers (overrides) when running multiple tests relying on the same handler in parallel. The requirement is, well, the same. I think the comments above will give you enough understanding of the issue.

kettanaito avatar Oct 23 '22 12:10 kettanaito

If I understand the requirement correctly, it is to run multiple instances of your tests at the same time, i.e. "parallelism"?

https://www.reddit.com/r/javascript/comments/y9whqr/comment/itc8do9/?utm_source=share&utm_medium=web2x&context=3

What do you mean by "parallelism"?

https://github.com/mswjs/msw/issues/474

I don't know why you would need the node executable to intercept requests.

I would just use multiple ServiceWorkers, fetch(), Request, and respondWith() to test multiple instances of clients making multiple requests https://plnkr.co/edit/Cees43q7hrPsdmFK?open=lib%2Fscript.js.

guest271314 avatar Oct 23 '22 13:10 guest271314

@guest271314, let me clarify a thing or two. Test runners like Jest usually support "parallel runs"; what they do internally is not a concern for this ticket. The end result is that there may be X test files running in parallel. Those test files have test cases. Those test cases can make HTTP requests. Some of those cases may want to handle those HTTP requests differently.

Here's a scenario for you to get a grasp of the issue:

// testA.test.js
it('fetches user detail', async () => {
  await fetch('https://api.example/user')
})

// testB.test.js
import { rest } from 'msw'

it('handles user errors', async () => {
  server.use(
    // For the GET https://api.example/user request ONLY within this test,
    // respond with a 500 response.
    rest.get('https://api.example/user', (req, res, ctx) => res(ctx.status(500)))
  )
  await fetch('https://api.example/user')
})

MSW already has request interception and runtime overrides via server.use(). The issue here is that these two tests, if run in parallel, may produce intermittent results. The first test may accidentally receive a 500 response coming from the runtime override in the second test. This used to happen for people in the past; I have no idea if it still happens now.

This closely depends on how a test runner implements parallelism. For example, if those two tests run in separate threads, then each of their setupServer setups will be unique and there will be no way for them to have shared state. If parallelism is achieved differently, well, then shared state may be possible.

There are quite a few examples of this above; please read them. People go into much better detail about what's happening than I do in this comment.

kettanaito avatar Oct 24 '22 10:10 kettanaito

@kettanaito

Test runners like Jest usually support "parallel runs"; what they do internally is not a concern for this ticket. The end result is that there may be X test files running in parallel. Those test files have test cases. Those test cases can make HTTP requests. Some of those cases may want to handle those HTTP requests differently.

I think the code I posted achieves the requirement - without bothering with the node executable at all.

Just append N <iframe> elements to an HTML document and utilize fetch(), EventSource, et al. to make requests and serve Responses accordingly.

guest271314 avatar Oct 25 '22 02:10 guest271314

@kettanaito Would it make sense for server.use() to create a new handlers array mapped to the moduleId for that test (rather than mutating the global one)? Tests could still default to using the global array, but any test that calls server.use() would get its handlers from a map at its moduleId key. This would enable isolation for tests using runtime handlers without the unnecessary memory use of every test getting its own handlers array by default.
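
A sketch of that idea (illustrative only, not MSW's actual internals):

const globalHandlers = []
const moduleHandlers = new Map()

function use(moduleId, ...handlers) {
  // Lazily create an isolated handlers array for this test module,
  // seeded from the global handlers.
  if (!moduleHandlers.has(moduleId)) {
    moduleHandlers.set(moduleId, [...globalHandlers])
  }
  // Prepend runtime handlers so they take precedence in lookup.
  moduleHandlers.get(moduleId).unshift(...handlers)
}

function handlersFor(moduleId) {
  // Tests that never call use() fall back to the shared global array.
  return moduleHandlers.get(moduleId) ?? globalHandlers
}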

clarksam19 avatar Nov 02 '22 17:11 clarksam19

@kettanaito is there any workaround for that?

BrunoQuaresma avatar Nov 17 '22 18:11 BrunoQuaresma

Is there any work going on with this? Being able to confidently override responses for a given endpoint in tests is fundamental to using msw. You don't want to test just the happy paths 😅

ubbe-xyz avatar Feb 07 '23 15:02 ubbe-xyz