rwe icon indicating copy to clipboard operation
rwe copied to clipboard

TDF parser runs very slowly in MSVC debug builds

Open MHeasell opened this issue 5 years ago • 6 comments

With an MSVC debug build, when you start RWE, nagivate to Skirmish, then open the map dialog to pick a map, RWE appears to hang.

Also, when you load the game, RWE appears to hang, or take a very long time.

Really what's happening is that RWE is parsing TDFs, and this is very slow in MSVC debug builds.

I believe this is because, in MSVC debug mode, iterators are very slow. This is because there are lots of additional debugging checks on iterators in debug mode.

They can be controlled with this flag:

https://docs.microsoft.com/en-us/cpp/standard-library/iterator-debug-level?view=vs-2019

however, changing the flag requires also building libraries we depend on with the different flag setting, too.

This makes development using MSVC / Visual Studio quite painful, because you want to run in debug mode, but the performance is unacceptably bad (as you typically want to load up into a full game to test unit behaviour).

MHeasell avatar May 30 '19 09:05 MHeasell

Would it be of any use to load game data from TDF files once then serialize them so the game would just have to deserialize them when starting? For triggering a refresh of the game data a checksum could be computed over the TDF files then compared at game start.

Also, maybe it would ne necessary to put the user interface in a separate thread/process to prevent freezing the UI among other things.

What do you think?

PHLF avatar Jun 23 '19 10:06 PHLF

I'm reluctant to do anything too smart in code to solve, as the poor performance is only observed in MSVC debug builds, and MSVC checked iterator debug features are well known for being very slow. I'm not really willing to take on extra code complexity to work around this when performance under other compilers and modes is acceptable.

I do think that loading the map list once, rather than every time the dialog opens, would be a good improvement. (And I remember seeing an implementation of this in your recent PR)

MHeasell avatar Jul 21 '19 11:07 MHeasell

I'm curious how much faster it goes in other builds, like can we quantify it? Mine was parsing for minutes with MSVC debug... I'm going to see if there's something simple that doesn't cripple debug ability.

KevinHake avatar Apr 28 '21 22:04 KevinHake

While debugging I measured the time between hitting "Select Map" and when the view changes (UI frozen during the wait): Debug build,: 4min58s Release w/debug info: ~5s Even in release mode, looks like 3% of the time is reading from file, and 92% parsing the data, so could still be a lot of room for improvement. Like if 5s became .5s, even with the crap debug build it would become more tolerable.

Speaking of debug - I've recently seen similar slowness in debug with the same compiler in another project. I suspect it's due to newer STL features - I think in the past few years (in contrast with before), Microsoft has made a big push to get standards compliant and up to date. The problem I've seen is they often roll out very inefficient changes just to get something working out the door.

I'm trying a couple of hunches, if they don't pan out, settling for Release w/ debug info for msvcp until we really need full unoptimized debug is a decent workaround (in VS2019, open CMakeSettings.json to change the Configuration type from Debug to Release).

KevinHake avatar Apr 29 '21 01:04 KevinHake

So I think I know what's going on: The slow part is in the utf8 library - the TLDR; after we run utf8::is_valid on our input text and have verified that the text is indeed valid utf8, we are using the utf8::checked interface when we should be using utf8::unchecked (otherwise every action the iterator makes is going to re-check for validity). I'm going to try to do that and see what happens.

Long version: I saw we spend a lot (28%) of our time in calls to utf8/core.h:223 validate_next, which is checking to make sure the next bytes are valid utf8. We want to validate that input, which is why in tdf.cpp::20 we do: if (!utf8::is_valid(input.begin(), input.end()))

BUT - the profiler only shows that check taking 4% of the total time... it turns out validate_next is being called way more /after/ we've already verified the input, inside parser.parse

There are two interfaces provided by the utf8 library - utf8::unchecked and utf8::checked. The checked one will re-check for validity with every call to peek, next, etc, and that's what's being used right now. So I think iterating over the string ends up doing many many times more work than necessary. The fix is to make sure after we've done the utf8::is_valid() check for the entire input, we are safe to use the utf8::unchecked interface. (If it's still slow, maybe we should look at a better library)

KevinHake avatar Apr 29 '21 06:04 KevinHake

Changing to the unchecked iterator helped some. I also optimized some of the loops in the tdf parser. Release: 2s (was 5s) Debug: 1min55s (was 5 min)

Still pretty slow. Maybe could get it down more by adopting more of a state machine and eliminating recursion where possible. Still not certain what exactly is the weakest point in the debug mode... it just looks like across the board it's 60x slower.

In some discussion on the forum, another avenue is to have the parser itself assume tdf input is always latin1 (which I think it is in TA), and only when storing parsed data internally bother with utf-8 conversion for chars > 127. From that perspective I think I could reduce the code a bit more. At a later date a utf-8-only mode could be added for tdf files annotated at the top to label them as such.

KevinHake avatar Apr 30 '21 03:04 KevinHake