
Test sharding / parallelization

Open vojtajina opened this issue 11 years ago • 43 comments

Allow splitting a single test suite into multiple chunks, so that it can be executed in multiple browsers and therefore executed on multiple machines or at least using multiple cores.

vojtajina avatar Mar 29 '13 08:03 vojtajina

Hey, I saw Issue #412 and it's exactly the sort of functionality I am looking for.

Has there been any movement on this? If there aren't any plans, do you have any high level thoughts on how you would like this to be implemented?

Cheers.

shteou avatar Jul 10 '13 13:07 shteou

I definitely wanna do this, it's hard to say when...

We need a way to group files. It should be dynamic (developers shouldn't have to group them manually), so that it's possible to scale (easily change the number of file groups - the number of browsers where we execute in parallel). When using a dependency management system (eg. Closure's goog.require, goog.provide; see karma-closure), this will be much easier and simpler (because we can figure out the dependency graph and therefore only load the files that are really needed). Probably "label" test files and then split these test files (assuming each test file has the same number of tests; later we can add some sort of cost/labeling).

The web server, when serving context.html, has to understand these groups and also know which browser is requesting. Probably an additional argument to the "execute" message to the client (browser), and the client, when refreshing the iframe, uses this "groupId" as a query param, eg. context.html?group=2. This grouping should probably be done in fileList (or once the file list changes; similar to what karma-closure does now; we should make the "file_list_modified" event async).
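As a rough sketch of the client half of that idea (a hypothetical helper, not actual Karma code), the context page could read its assigned group from the query string:

```javascript
// Hypothetical sketch: extract the group id from a context URL such as
// context.html?group=2. Returns null when no group was assigned.
function groupFromUrl(url) {
  const match = /[?&]group=(\d+)/.exec(url);
  return match ? Number(match[1]) : null;
}
```

The server would then serve only the files resolved for that group when building context.html.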

Currently the resolved files object looks something like:

{
  served: [
    // File objects (metadata, where the file is stored, timestamps, etc.)
  ],
  included: [
    // File objects
  ]
}

After this change, it would be:

{
  served: [
    // File objects (I'm thinking we might make this a map rather than an array)
  ],
  included: {
    0: [
      // files included in the group=0 browser
    ],
    1: [
      // files included in the group=1 browser
    ]
  }
}
vojtajina avatar Jul 14 '13 23:07 vojtajina

@shteou Would you be interested in working on this ? I would definitely help you...

vojtajina avatar Jul 14 '13 23:07 vojtajina

Also, does it make sense?

I mentioned karma-closure a couple of times; that is this plugin: https://github.com/karma-runner/karma-closure It is very "hacky" (the way it hooks into Karma; if you had multiple plugins like this, it would end up really badly ;-)), but it does interesting things - it basically analyzes the dependencies and can therefore generate the list of included files based on those dependencies. So before this "resolving" we would group the test files and karma-closure would resolve each group separately...

vojtajina avatar Jul 14 '13 23:07 vojtajina

I'd just like to note that we (Wix) tried this out. danyshaanan created a test case for this. He enabled karma to split up the loaded tests and run them in several child processes.

It didn't work out too well, since many of our tests require loading a large part of our code, and therefore the setup time and loading of all the packages for each child process was too costly. Depending on the number of child processes, it usually ended up running slower, though it did come close to running in the same amount of time.

We tried it out when we were at about ~2,000 tests. We now have over 3,500 tests in this project, so it might be worth revisiting this.

If anyone else is working on this or has another angle for this, we are also more than happy to help.

EtaiG avatar Dec 27 '14 23:12 EtaiG

@EtaiG I have not started working on this, but as my project currently exceeds 3,000 tests, it's becoming something I want to invest some time into as well.

jbnicolai avatar Jan 09 '15 10:01 jbnicolai

A note about dynamic creation of groups (@vojtajina) - We should be aware of how this affects tests that happen to have side effects. Imagine tests B, C, and D, and a naive alphabetical division of tests into two groups - {B,C} and {D}. Let's say C has a devastating side effect and I'm adding test A, hence changing the grouping to {A,B} and {C,D}. Now D will fail, just because the grouping changed.

Of course, tests shouldn't have side effects, but this case is bound to happen, and might be very confusing to users.
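A quick illustration of that hazard (hypothetical code, not from any implementation), using a naive contiguous alphabetical split:

```javascript
// Naive contiguous split: sort the files and cut the list into
// equally sized chunks. Adding one file can shift every later file
// into a different group.
function chunk(files, numGroups) {
  const sorted = [...files].sort();
  const size = Math.ceil(sorted.length / numGroups);
  const groups = [];
  for (let i = 0; i < sorted.length; i += size) {
    groups.push(sorted.slice(i, i + size));
  }
  return groups;
}

chunk(['B', 'C', 'D'], 2);      // [['B', 'C'], ['D']] - D never sees C's side effect
chunk(['A', 'B', 'C', 'D'], 2); // [['A', 'B'], ['C', 'D']] - now D runs after C
```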

danyshaanan avatar Jan 11 '15 16:01 danyshaanan

I think we can ignore this case and let people who encounter it deal with their own problems. We can also expose an API for this to allow the consumer to decide the grouping.


EtaiG avatar Jan 11 '15 17:01 EtaiG

+1

We have a large number of tests at work, and sharding would be very beneficial. As you said, there shouldn't be side effects between tests, and for anyone who doesn't want to remove the side effects, I'd say they just don't get to run their tests in parallel :)

As long as the sharding is opt-in I think the confusion should be manageable.

scriby avatar Mar 30 '15 18:03 scriby

Hey @EtaiG and @danyshaanan, very interested in this experiment you mentioned. Is this code accessible somewhere? I'd very much like to experiment with this a bit - maybe your work could give me a head start!

LFDM avatar Apr 20 '15 20:04 LFDM

@LFDM We have nothing to share at the moment, but I'm just about to rewrite a smarter version of it in the coming couple of weeks. I'll try to do so in a way I'll be able to share. Feel free to ping me about this in a week or so if I haven't posted anything by then.

danyshaanan avatar Apr 20 '15 21:04 danyshaanan

:+1: Sounds like a great idea to me; Would love to see any progress updates on this @danyshaanan!

AlanFoster avatar May 09 '15 10:05 AlanFoster

:+1: this would be great.

park9140 avatar Aug 10 '15 18:08 park9140

@danyshaanan! any news?

ghost avatar Sep 08 '15 15:09 ghost

@aaichlmayr: yeah, bad ones - it didn't seem to work out that well. This feature's spec is non-trivial, so the implementation was not as clean as I would hope, and the benefits were not convincing enough to go ahead with it, so we scrapped this plan.

danyshaanan avatar Sep 09 '15 06:09 danyshaanan

Thanks for the info

ghost avatar Sep 09 '15 13:09 ghost

As long as the sharding is opt-in I think the confusion should be manageable.

Totally agree. Think it's fine to launch as an experimental feature with this requirement. Would love to see this land and would be happy to help bug-hunt, etc.

booleanbetrayal avatar Dec 16 '15 15:12 booleanbetrayal

Hi, has the feature been shipped already?

FezVrasta avatar Dec 21 '15 14:12 FezVrasta

:+1: anyone making headway on this?

navels avatar Apr 11 '16 17:04 navels

So is this a feature yet?

presidenten avatar Aug 17 '16 14:08 presidenten

I don't get it. Did I do something wrong? What did I miss? Why the thumbs down?

presidenten avatar Aug 18 '16 12:08 presidenten

If you reply to an issue, all the subscribed people get an email and a notification. If you just want to add a +1 on the issue, do so by adding a thumbs-up reaction to the first post (or to the one with the most upvotes); that way you don't flood the whole list of subscribers.

FezVrasta avatar Aug 18 '16 12:08 FezVrasta

@dignifiedquire Could you lock this one like #1320 with a help:wanted label? Thanks!

Florian-R avatar Aug 18 '16 12:08 Florian-R

When using a dependency management system this will be much easier and simplified

True, but it's a hack to get these systems to work in karma in the first place, right? I'm tempted to put that consideration aside for now.

I agree with the suggestions made by @vojtajina in https://github.com/karma-runner/karma/issues/439#issuecomment-20945728 (even if they are three and a half years old :)

I'm thinking something like:

module.exports = function(config) {
  config.set({
    files: [
      'lib/lib1.js',
      'lib/lib2.js',
      {pattern: 'other/**/*.js', included: false},
      {pattern: 'app/**/*.js', grouped: true},
      {pattern: 'more-app/**/*.js', grouped: true},
    ],
  });
};

And then the resolved object would be

{
  served: [
    'other/file1.js',
    'other/files/file2.js'
  ],
  included: {
    common: [
      'lib/lib1.js',
      'lib/lib2.js'
    ],
    groups: {
      0: [
        'app/file1.js',
        'app/file2.js'
      ],
      1: [
        'app/file3.js',
        'more-app/file.js'
      ]
    }
  }
}

We could reuse concurrency, though a default of Infinity is bad -- most commonly we want to run as many tests as we have cores.

We'd probably want a groups config. I could divide my code into 10 groups, and run with concurrency 3 until they are all done. As @EtaiG pointed out, there is a balance between fine-grained scheduling for better utilization, and overhead of loading common files.
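That scheduling could be sketched roughly like this, assuming a hypothetical runGroup(id) callback that resolves when the browser running that group finishes:

```javascript
// Run the given group ids with at most `concurrency` browsers at a time.
// Each "worker" pulls the next group id off the shared queue until empty,
// so 10 groups with concurrency 3 keeps 3 browsers busy until all are done.
async function runSharded(groupIds, concurrency, runGroup) {
  const queue = [...groupIds];
  const workers = Array.from({length: concurrency}, async () => {
    while (queue.length > 0) {
      await runGroup(queue.shift());
    }
  });
  await Promise.all(workers);
}
```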

pauldraper avatar Dec 13 '16 19:12 pauldraper

I'd hate to have to group tests by hand. What if the system used the regular configuration as a starting point, and built up and refined an optimal parallel test plan over time? Along the way it might discover dependency chains (that shouldn't be there, but might be). It could flag those as "todo" items for developers, but work around them when it finds them. Whatever it does, it should handle changes in the test code gracefully, so it doesn't have to recompute everything when a single test is added (or removed).

I'm sure the computational complexity would be enormous for getting at the very best configuration, but maybe some rough heuristics would get us reasonably close.

habermeier avatar Dec 28 '16 22:12 habermeier

really, just copy what jest does. it's fine

FezVrasta avatar Dec 28 '16 23:12 FezVrasta

I'd hate to have to group tests by hand.

Not sure if I understand this right, but I wasn't suggesting that.

{pattern: 'app/**/*.js', grouped: true},
{pattern: 'more-app/**/*.js', grouped: true}, 

grouped just means "these are the files that are eligible for sharding", as opposed to the common library files that are in all the test groups. Karma then generates a number of arbitrary groups automatically from those locations. In the example, the generated groups were app/file1.js, app/file2.js and app/file3.js, more-app/file.js.

I suggest a groups config option for the number of groups. It can be tuned to weigh scheduling efficiency against startup overhead.

pauldraper avatar Jan 02 '17 23:01 pauldraper

Is this dead in the water?

brandonros avatar Apr 20 '17 18:04 brandonros

Hello - Here is my proposal for running tests in parallel.

This is a very simple sharding strategy - but it should provide a speedup just by using the multiple processors on the machine. This is meant mostly for local development and not so much for CI runs (where remote CI setup costs far outweigh the speed gains of parallelization).

karma.js changes:

  • The root Karma URL can take in 3 extra URL parameters - shardId, shardIndex, totalShards
  • This is passed on to the context iframe

Jasmine (or Mocha, etc.) adapter changes:

  • In the context page, the Jasmine (or Mocha) adapter can process the shardIndex and totalShards parameters
  • The adapter passes the shardId, shardIndex and totalShards in the "info" object it uses while connecting back to the Karma server (see the section below on how it's used)
  • The adapter walks the suite/spec tree and collects all leaf specs in an array
  • The adapter uses a very simple sharding strategy - it runs the subset of tests [(shardIndex/totalShards * totalTests) -> ((shardIndex + 1)/totalShards * totalTests)]
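That slicing rule could be sketched as follows (specsForShard is a made-up helper name):

```javascript
// Shard i of N runs the half-open range
// [floor(i/N * total), floor((i+1)/N * total)) of the collected leaf specs.
// The shards are disjoint and together cover every spec exactly once.
function specsForShard(specs, shardIndex, totalShards) {
  const start = Math.floor(shardIndex / totalShards * specs.length);
  const end = Math.floor((shardIndex + 1) / totalShards * specs.length);
  return specs.slice(start, end);
}
```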

Karma server changes:

  • The server now has logic to wait for all shards to connect before starting a test run
  • This is so that test execution doesn't start as soon as the browser instance for the first shard connects, and then run once more when the rest of the shards connect
  • The server uses "shardId" and "totalShards" to determine whether enough sharded browsers with the same "shardId" have connected

Chrome (or other launcher) changes:

  • "ChromeSharded" (and "ChromeHeadlessSharded", etc.) is a new type of launcher that launches Chrome with N different tabs - each with the same shardId and totalShards and the appropriate shardIndex
  • Default number of shards = number of processors on the machine
  • Overridden by a launcher flag / environment variable

Reporter changes: (I need to flesh this out more. Any ideas welcome here)

  • No changes in the initial version - for local runs the reporter output doesn't matter much? Print errors from any of the shards to the console
  • Ideally reporter can collate all results from all the shards into a single report

vikerman avatar Jun 06 '17 04:06 vikerman

I'm going to try to get this done this weekend. I don't know anything about this project or the codebase, but I think a ton of people would be saved a ton of time if I can figure something out. Maybe different tabs running on different ports? Could get hairy...

brandonros avatar Aug 02 '17 17:08 brandonros