xsimd icon indicating copy to clipboard operation
xsimd copied to clipboard

Multiple constructor calls for one object initialization

Open wermos opened this issue 2 years ago • 6 comments

Once again, as I pointed out in #775, a single xsimd::batch object initialization ends up calling multiple constructors:

I took the liberty of putting std::cout statements in the batch constructor code, like so:

/**
 * Create a batch with all element initialized to \c val.
 */
template <class T, class A>
inline batch<T, A>::batch(T val) noexcept
    : types::simd_register<T, A>(kernel::broadcast<A>(val, A {}))
{
    std::cout << "broadcasting constructor\n";
}

/**
 * Create a batch with elements initialized from \c data.
 * It is an error to have `data.size() != size.
 */
template <class T, class A>
inline batch<T, A>::batch(std::initializer_list<T> data) noexcept
    : batch(data.begin(), detail::make_index_sequence<size>())
{
    details::check_batch_init(*this, data);
    std::cout << "intializer list constructor\n";
}

And then ran a little test program:

#include <iostream>
#include <xsimd/xsimd.hpp>

int main() {
    xsimd::batch<double, xsimd::avx2> b(2.5);
    std::cout << "==========================\n";
    xsimd::batch<double, xsimd::avx2> c{2.5, 5, 7.5, 10};
    std::cout << "==========================\n";
    xsimd::batch<double, xsimd::avx2> d = {3.5, 7, 10.5, 14};
}

And this was the output:

register_type constructor
broadcasting constructor
==========================
register_type constructor
pointer-to-data constructor
intializer list constructor
==========================
register_type constructor
pointer-to-data constructor
intializer list constructor

We should fix this. Apart from the obvious gain in performance in debug builds, we should also see a small performance increase in Release builds (in my experiments, I got the same behavior even in Release mode).

wermos avatar Jun 24 '22 09:06 wermos

Where does the "pointer-to-data constructor" output come from?

Also I guess this issue should be solved when we merge the variadic constructor implementation. Could you check if you have the same kind of issue on this branch?

JohanMabille avatar Jun 24 '22 09:06 JohanMabille

Where does the "pointer-to-data constructor" output come from?

I put std::cout statements on every single constructor. I only reproduced two of them here, as examples. If you want, I could push this code in a branch on my fork and link it here.

Also I guess this issue should be solved when we merge the variadic constructor implementation. Could you check if you have the same kind of issue on this branch?

Will do.

wermos avatar Jun 24 '22 09:06 wermos

Also I guess this issue should be solved when we merge the variadic constructor implementation. Could you check if you have the same kind of issue on this branch?

Will do.

It seems that the branch isn't ready for testing yet. I'm getting compilation errors currently. I'll try once again once the branch is a bit more mature.

wermos avatar Jun 24 '22 10:06 wermos

(in my experiments, I got the same behavior even in Release mode).

@wermos worth nothing that might actually be because of your std::cout statements - I suspect that in Release builds a completely delegating constructor (i.e. one with no extra code in the function body) will be completely folded in most situations.

You are definitely correct about debug perf here, though. When I step through the ctor init chain using MSVC it steps through each call individually.

marzer avatar Jun 24 '22 10:06 marzer

@wermos worth nothing that might actually be because of your std::cout statements - I suspect that in Release builds a completely delegating constructor (i.e. one with no extra code in the function body) will be completely folded in most situations.

Ah yes, you're right. In Release mode, none of the unnecessary constructor calls happened, once I removed the print statements (I checked using gdb.)

wermos avatar Jun 24 '22 14:06 wermos

Also I guess this issue should be solved when we merge the variadic constructor implementation. Could you check if you have the same kind of issue on this branch?

Will do.

It seems that the branch isn't ready for testing yet. I'm getting compilation errors currently. I'll try once again once the branch is a bit more mature.

I redid the process with the exact same test program, and here is the result:

register type constructor
single element constructor
==========================
register type constructor
variadic template constructor
==========================
register type constructor
variadic template constructor

It appears that this constructor:

template <class T, class A>
inline batch<T, A>::batch(register_type reg) noexcept
    : types::simd_register<T, A>({ reg })
{
    // std::cout << "register type constructor\n";
}

always runs before any other constructor.

wermos avatar Jul 09 '22 15:07 wermos

Back on this one. I understand that debug builds are slower with delegating constructors, but that's the case for any delegating constructor, right? And delegating constructor avoid code duplication. Actually, the following trace extracted from above

register_type constructor pointer-to-data constructor intializer list constructor

Makes a lot of sense to me:

  1. access contiguous memory from intiializer_list
  2. load that into a register type
  3. store that register type as an object member.

Those are three different steps and I'm glad I can rely on a different constructor to acheive each step, and don't need to duplicate the logic for doing so.

serge-sans-paille avatar Dec 13 '22 20:12 serge-sans-paille

Closing as it makes no sense to me to optimize debug build at the expense of code duplication.

serge-sans-paille avatar Jun 06 '23 19:06 serge-sans-paille

it makes no sense to me to optimize debug build at the expense of code duplication.

This is true in the general case, but a simd library is, by it's very nature, performance-oriented. In my experience your evaluation is misguided - debug build performance is always worth thinking about! 😢

marzer avatar Jun 06 '23 19:06 marzer