cppcheck icon indicating copy to clipboard operation
cppcheck copied to clipboard

astutils.h: use pre-sized SmallVector in visitAstNodes()

Open firewave opened this issue 3 years ago • 4 comments

This is the first of two PR introducing the usage of SmallVector in our code.

I adjusted the value locally based on profiling it with callgrind. When using Clang and Boost the indicator that we need to increase the size of the vector are boost::container::vec_iterator<> boost::container::vector<>::priv_forward_range_insert_no_capacity<>(...) and operator delete(void*) calls in findAstNode(...). When using no Boost and reserve() the indicator is amount of additional operator new(...) and operator delete(...) compared to the actual calls of findAstNode(...). Update: With findAstNode() now being inlined it is now harder to trace this. You now need to follow the operator new() or operator delete() but you won't have a function or proper call count to compare against. It seem like memcpy@GLIBC_2.2.5 is now the indicator that the vector is being expanded and needs to reserve more space.

I still need to run a bigger test to see if we might need to increase the count even more.

Here's some numbers for a -O2 build using callgrind with --enable=all --inconclusive on the mame_regtest project, DISABLE_VALUEFLOW=1.

Clang 13 2,263,092,948 - std::vector 2,242,113,164 - std::vector (reserve(8)) 2,263,074,159 - SmallVector (no Boost / size 8) 2,242,113,077 - SmallVector (no Boost / size 8 / reserve(8)) 2,162,604,179 - SmallVector (Boost 1.74 / size 8) 2,167,203,248 - SmallVector (Boost 1.74 / size 8 / reserve(8))

GCC values coming soon. ~I will file a ticket about the GCC difference soon.~

firewave avatar Mar 20 '22 10:03 firewave

The Visual Studio compilation failure was caused by the incorrect C++ standard being using by the *-PCRE configurations in the cli.vcxproj.

I opened #3937 for the compilation/selfcheck fixes.

firewave avatar Mar 25 '22 10:03 firewave

@Ken-Patrick @pfultz2 @danmar

There's another intermediate step it seems. Using reserve() on the underlying container helps to reduce unnecessary allocations if we are not using Boost.

    std::vector<T *> tokensContainer;
    tokensContainer.reserve(8);
    std::stack<T *, std::vector<T *>> tokens(std::move(tokensContainer));

Unfortunately that causes a performance regression when using Boost so that code requires to be conditional which is messy.

I wonder if what the reserve() call does could be achieved through the TaggedAllocator when using SmallVector without Boost as we are currently not using the given size. Unfortunately I have zero experience with custom std::allocator.

I provided values for the various cases in the description.

firewave avatar May 28 '22 15:05 firewave

As std::vector::reserve() already provides a sizable improvement without need for any external dependencies I prepared #4158 which implements until SmallVector provides the same advantage in the non-Boost implementation.

firewave avatar May 31 '22 21:05 firewave

With GCC I still see (all?) calls to void boost::container::vector<>::priv_push_back<>(Token const* const&) during profiling. It seems boost::container::small_vector is not working as expected with GCC.

firewave avatar Jun 03 '22 07:06 firewave

This is almost done. 🙂 I still need to test it with the latest boost, add a build step to the CI and integrate it with the Visual Studio build (since it doesn't add any dependencies the official binaries should utilize it).

Here's a small preview of the performance win:

Clang 14 51,333,013,384 -> 37,000,473,137 GCC 12 55,161,312,808 -> 37,878,403,473

firewave avatar Aug 23 '22 16:08 firewave

I enabled the Boost usage for the sanitizer builds. That tests the build and runs the tests as well as (hopefully) providing a slight performance improvement for the sanitized runs.

The last remaining thing is to build the Windows version with Boost.

firewave avatar Sep 08 '22 09:09 firewave

This is finally ready for review. 🥳 I think the very slight Clang regression without Boost can be dismissed.

current -> SmallVector -> SmallVector + Boost 1.74

Performing a --enable=all --inconclusive scan on mame_regtest.

With DISABLE_VALUEFLOW=1:

Clang 14 2,026,151,486 -> 2,026,159,272 -> 1,854,855,377 GCC 12 1,994,489,219 -> 1,943,410,321 -> 1,842,068,039

With Valueflow enabled:

Clang 14 51,130,767,151 -> 51,131,952,779 -> 32,275,080,333 GCC 12 54,989,579,042 -> 49,360,382,494 -> 37,944,747,061

And scanning cli with --enable=all --inconclusive -Ilib -D __GNUC__ and DISABLE_VALUEFLOW=1

Clang 14 2,935,467,131 -> 2,935,473,066 -> 2,800,613,440 GCC 12 3,009,733,699 -> 2,968,684,311 -> 2,841,416,862

firewave avatar Sep 28 '22 12:09 firewave