
Improve performance

Open meisterT opened this issue 9 years ago • 34 comments

For some checktestdata programs, performance is terrible. Is there low-hanging fruit that would improve the general performance of checktestdata?

meisterT avatar May 31 '16 18:05 meisterT

Some issues to look into:

  • FLOAT is slow: changing from REGEX("[0-9.]+") to FLOAT(0,6.2832) made reading 10^5 floats take 44 seconds longer (total runtime including reading other stuff went from 11 to 55 seconds)
  • INARRAY is even slower: 10^5 checks of an INT inside an array of 10^5 INTs increased the runtime of the same testcase to 7.5 minutes.

eldering avatar Aug 11 '16 04:08 eldering

These combined fixes improved checktestdata runtime on this testcase from ~8 min to ~2 sec.

eldering avatar Aug 13 '16 19:08 eldering

An example where I gave up on using checktestdata and switched to python due to speed: problem I (Interception) from NCPC 2016. Problem package available here: http://ncpc.idi.ntnu.no/ncpc2016/ncpc2016-packages.tar.bz2 (there is an unused checktestdata script included).

There is lots of data so maybe there is not much to do, but one feature that would have been helpful here would be a version of the UNIQUE function that compares entries as unordered (multi)sets rather than ordered tuples. Then one could have skipped the annoying if-swap in the main loop, which takes a bit of the time (and pollutes the code). (That feature alone wouldn't have made the script fast enough, but it would at least have made it a bit less slow.)

austrin avatar Oct 17 '16 09:10 austrin

I've tested this on my computer. The test case 09.max_ans.in check runs in 31 seconds; without the swap it takes 20 seconds. That's a decent bit less, but I'm not sure it warrants introducing a completely new command. I also analyzed the call graph with callgrind, and a quick glance did not show any clear targets for optimization. A fair amount of time (14%) is spent in GMP functions, another 10% in boost::variant, and 8% just in std::__lexicographical_compare_impl over GMP integer types.

eldering avatar Oct 21 '16 02:10 eldering

That's a decent bit less, but I'm not sure it warrants introducing a completely new command.

Right, I agree that this speed improvement on its own is not sufficient reason to add such a command, but there may be other small reasons as well which together make it worth it. Another such reason could be that this type of constraint is somewhat common and the swap dance is a bit annoying and makes the scripts harder to read, so such a command could be nice to have regardless of performance issues. I don't really have a strong opinion on this, but at least wanted to bring up the possibility for you to consider.

austrin avatar Oct 21 '16 06:10 austrin

@eldering having been frustrated by a few slow ctd scripts in the past weeks, I was wondering a bit more about potential speed improvements. One thought: would it be possible to add some sort of pragma-ish directive to force 64-bit integer arithmetic instead of arbitrary precision (since GMP takes a lot of the time)? I guess one would then want to do overflow checks every time one does an operation (and raise an error on overflow), but maybe it would still be faster, in particular for the common "read a million integers between 1 and 1000" type of format.

austrin avatar Apr 22 '18 13:04 austrin

[Funny, I was just looking into a few minor checktestdata issues, including this one.]

Yes, I think that would be a good avenue to explore, and would make sense if it improves performance significantly. I currently don't have the WF2018 problemset available. Is it (publicly) available somewhere? Otherwise, I'd have to wait a little before our backups are online.

eldering avatar Apr 22 '18 13:04 eldering

I don't think so, but one set you could look at is NWERC 2017, in particular the problems ascendingphoto and factorfree. The validators look almost the same, except that one stores the values in an array (without using them for anything), and that one is about 3x slower than the other. But even the faster one is a bit sluggish -- on my machine it takes around 1.5s just to read a million numbers which with a lot of test files gets a bit annoying.

austrin avatar Apr 22 '18 16:04 austrin

I have an experimental branch of checktestdata in antlr_rewrite that seems to be a lot better performance-wise; on the other hand, it is by far not as well tested, and it currently doesn't really have helpful error messages for non-conforming input.

TPolzer avatar Jul 12 '18 20:07 TPolzer

Well, the testsuite passes, so that's already quite something?

thijskh avatar Jul 13 '18 06:07 thijskh

What's the plan for @TPolzer's rewrite? Can we merge that?

meisterT avatar Oct 03 '18 07:10 meisterT

The one area where it would be a clear regression is that it cannot generate inputs. Is that a feature people actually use?

TPolzer avatar Oct 03 '18 20:10 TPolzer

It also lacks an option to be generous with white space, but that would probably be easy to implement.

TPolzer avatar Oct 03 '18 20:10 TPolzer

Summary from a short discussion on #domjudge on the antlr rewrite: I'd prefer to keep using Make instead of Bazel since Make is available everywhere. If we can improve a few of the missing features (and aim for feature parity in the longer run), then I'm fine with replacing the current code with this rewrite.

eldering avatar Oct 06 '18 13:10 eldering

@TPolzer can you please describe how to build the current version of the antlr branch, i.e. which dependencies have to be installed?

meisterT avatar Oct 28 '18 14:10 meisterT

Yes, .travis.yml isn't too bad a guide, necessary build dependencies are

  • boost development headers
  • gmp development headers
  • a recent(ish) g++
  • bazel

Then do 'bazel test test' to run the testsuite, and 'bazel build checktestdata' to build the main executable (which will end up at bazel-bin/checktestdata).

TPolzer avatar Oct 28 '18 16:10 TPolzer

I'm not a big fan of make, and I actually think that having the build depend on bazel and hosting prebuilt static binaries as GitHub releases might be a nicer experience for everybody involved.

TPolzer avatar Oct 28 '18 16:10 TPolzer

Can you put your version in a separate directory? I don't like replacing the current version completely, for a number of reasons. First, the current version still has features that your code does not (yet) support, such as generating test data. Secondly, Bazel is in neither Debian nor Ubuntu, so at this point make is still more of an off-the-shelf build system. I agree we can release static binaries, but that can be done either way, and I don't think that should be the only form of release; for example, being able to create Debian packages is also a nice feature, which doesn't work with Bazel at the moment.

eldering avatar Oct 29 '18 22:10 eldering

I was thinking about this issue a bit more. We have to separate the group of contributors to checktestdata from the group of users of checktestdata to come up with a qualified decision how to proceed.

For users, the benefit of the rewrite is a much faster checktestdata than the old version - on average it's close to a 10x improvement. Users typically don't care how we achieve that speedup, the normal user also doesn't need to build the software from source. We haven't been updating the checktestdata debian package in a while and haven't been distributing static binary releases, but we probably should do both in any case. We should probably also look into distributing it via homebrew.

For contributors on the other hand, the build system matters more, but it's also a much smaller group of people. Overall, we had contributions from 8 people in total.

For Bazel there's a debian package which can be installed like this:

echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel

So it's not hard to install (and keep it up to date).

btw. Bazel supports building debian packages itself: https://docs.bazel.build/versions/master/be/pkg.html#pkg_deb (disclaimer: I've never used the packaging rules)
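For illustration, a hypothetical BUILD fragment along the lines of those docs (untested, per the disclaimer above; the load path and attribute names should be checked against the pkg_deb documentation for the Bazel version in use):

```python
# Hypothetical sketch, not from the repository.
load("@bazel_tools//tools/build_defs/pkg:pkg.bzl", "pkg_deb", "pkg_tar")

pkg_tar(
    name = "checktestdata-bin",
    srcs = [":checktestdata"],
    package_dir = "/usr/bin",
)

pkg_deb(
    name = "checktestdata-deb",
    data = ":checktestdata-bin",
    package = "checktestdata",
    version = "1.0",
    maintainer = "DOMjudge team",
    description = "Tool for validating contest test data",
)
```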

Note that I was also trying to add a Makefile in the new branch while I was trying to make up my mind. However, the antlr version in debian stable cannot generate cpp code: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=902798

Regarding the idea of putting the new version in a separate directory: I don't really like this because it increases the maintenance overhead. We have the original version, the haskell implementation, and now the antlr_rewrite. Any change we make, we have to implement in every version, which either slows down development or makes the implementations diverge. We saw how that didn't work for the branches in the DOMjudge repository, so I would prefer not to make the same mistake here.

To summarize: users benefit greatly from the new version, while it gets a little bit harder for us as contributors. I think this is a fair tradeoff to make.

I propose to do the following:

  • [ ] investigate if the dependency on g++-7 is necessary or whether we can make it work with g++-6
  • [ ] add more test data to make sure that the new version is actually correct
  • [ ] improve error messages
  • [ ] add option to generate test data in the new version
  • [ ] switch to the new version with static releases
  • [ ] look into homebrew

meisterT avatar Nov 14 '18 09:11 meisterT

I propose to maybe also look into making a Docker image with a checktestdata binary in it when we do static releases.

I agree with the rest of your ideas.

nickygerritsen avatar Nov 14 '18 09:11 nickygerritsen

We have to separate the group of contributors to checktestdata from the group of users of checktestdata to come up with a qualified decision how to proceed.

Some further comments/questions on this. While I'm not a contributor, from the perspective of problemtools (which lives downstream of checktestdata and maybe contributes a non-negligible fraction of checktestdata's users?) it would be desirable if checktestdata continues to be easily buildable using standard packages. But that 10x speedup would be so sweet to get.

If I understand correctly the bazel-vs-make-vs-whatever-floats-your-boat issue is in principle independent of the rewrite, and there is no inherent reason why it would have to use bazel?

Is the bazel part only needed to build the parser, or is it needed also for building the binary from the parser (i.e., as on the current release branch)?

austrin avatar Nov 14 '18 21:11 austrin

Well, currently bazel is used for everything, and it's probably cumbersome to have two build systems for one project.

Just to rule out that possibility: would it work for you to distribute the released checktestdata binary instead? Or that you require installing checktestdata separately like you require installing other packages (https://github.com/Kattis/problemtools#requirements-and-compatibility) and also compilers.

meisterT avatar Nov 15 '18 07:11 meisterT

I am still not convinced we need to use Bazel at all. I prefer make for the same reasons Per also mentioned, and I'm not a fan of installing software from sources outside of Debian.


eldering avatar Nov 15 '18 08:11 eldering

Jaap, what's your concrete suggestion then? Waiting until the next debian version is released?

meisterT avatar Nov 15 '18 08:11 meisterT

I'm not sure I follow. I'm ok with depending on antlr4 from Debian Buster (since that's at least going to be in the upcoming release, and thus also in Ubuntu). But I'm proposing to keep/rewrite the build system in make instead of Bazel. Is there any reason this cannot be done?

eldering avatar Nov 15 '18 08:11 eldering

Ah ok, I was implicitly assuming that you insist on debian stable. If that's not the case, I'll try to cook something up that works on buster without bazel.
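A rough sketch of what such a Makefile could look like (the grammar name, file layout, and linker flags are guesses, not the actual repository layout; the antlr4 tool and its C++ runtime are packaged in buster):

```make
# Hypothetical Makefile sketch: generate the C++ parser with the antlr4
# tool from Debian buster, then compile it with the checktestdata sources.
ANTLR   = antlr4
GRAMMAR = checktestdata.g4   # assumed grammar file name
GENDIR  = generated

$(GENDIR)/.stamp: $(GRAMMAR)
	mkdir -p $(GENDIR)
	$(ANTLR) -Dlanguage=Cpp -o $(GENDIR) $(GRAMMAR)
	touch $@

checktestdata: $(GENDIR)/.stamp main.cpp
	g++ -O2 -std=c++17 -I$(GENDIR) -o $@ $(GENDIR)/*.cpp main.cpp \
	    -lantlr4-runtime -lgmpxx -lgmp
```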

meisterT avatar Nov 15 '18 08:11 meisterT

In case anyone else is interested in how much faster the antlr rewrite is, I made the following plot:

[plot: ctd_antlr_rewrite_speed (per-script runtimes, old checktestdata vs antlr rewrite)]

Each dot is a checktestdata script for a problem. Its x coordinate is the sum of runtimes (in seconds) of the script on all test cases using checktestdata version 20180411, and its y coordinate is the sum of runtimes using the antlr rewrite. The dotted line is the y=x line. Note that the plot uses log-log axes. (The plot excludes all scripts where the two checktestdata versions currently produce different results, cf #7.)

There's really only one script where the antlr-rewrite does significantly worse than the current checktestdata, and I've made it available here if you want to investigate it further: https://gist.github.com/austrin/65576baa9aa8601e3b06da2cc8cad26d (yes, it's a funny one)

austrin avatar Nov 16 '18 17:11 austrin

That tight cluster around 100 ms, where there seems to be a small constant-factor regression, looks interesting too. The somewhat larger regression on the very fast cases I've also observed, and I think it is mostly startup overhead of the antlr parser itself.

TPolzer avatar Nov 16 '18 18:11 TPolzer

Thanks Per. Do you mind sharing the raw data points?

meisterT avatar Nov 16 '18 19:11 meisterT

@TPolzer re the cluster around 100 ms, I think it is also just about the startup overhead you mention. Below is a similar plot for individual test cases (i.e., each dot is a run of a checktestdata script on a single test case). As you can see, the cases where the rewrite is slower are almost entirely runs in the sub-10 ms range (and many problems consist of around 20 or so such test cases, which then causes the cluster in the previous graph).

[plot: ctd-runtimes (per-testcase runtimes, old checktestdata vs antlr rewrite)]

austrin avatar Nov 17 '18 15:11 austrin