chiavdf icon indicating copy to clipboard operation
chiavdf copied to clipboard

Stop 1-Weso squaring on reaching target iterations

Open xearl4 opened this issue 2 years ago • 6 comments

For 1-Wesolowski proofs (used for "blueboxing") the squaring thread previously needlessly continued squaring while proof computation was in progress. In practice, proof computation also takes a while, so this could easily result in 10-20M additional squaring iterations. With this change, the squaring stops after the iterations target is reached. This also makes CPU utilisation more predictable. Previously, during squaring 2 threads were heavily utilised, and 3 threads during proof computation. Now it's never more than 2 high-load threads: 2 for squaring, 1 for proof computation.

This was originally written more than 2 years ago for the "blueboxing group" we ran in a SpaceFarmers.io Discord channel and was used by the people participating in that group intensively over several months. With the recently renewed interest in blueboxing, I remembered that I never got around to upstreaming that change: so here we go :)

xearl4 avatar Apr 04 '24 19:04 xearl4

Awesome!

hoffmang9 avatar Apr 04 '24 20:04 hoffmang9

The macos TSAN (thread sanitizer) test does seem to be failing - this only runs on macOS intel so perhaps it's usefulness is declining. The Ubuntu TSAN test passed - so this is a curious failure

emlowe avatar Apr 04 '24 21:04 emlowe

The macOS TSAN tests passes for me in main of chiavdf, but does not with this change. Although I can't quite figure out why yet since it seems to be complaining about std::cout

==================
WARNING: ThreadSanitizer: data race (pid=5847)
  Read of size 4 at 0x7ff8569a1348 by main thread:
    #0 std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::__put_character_sequence<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, char const*, unsigned long) <null>:2 (1weso_test:x86_64+0x10002bbe8)
    #1 main <null>:2 (1weso_test:x86_64+0x10003e279)
    #2 start <null>:2 (dyld:x86_64+0x552d)
    #3 start <null>:2 (dyld:x86_64+0x552d)

  Previous write of size 4 at 0x7ff8569a1348 by thread T1:
    #0 std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::__put_character_sequence<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, char const*, unsigned long) <null>:2 (1weso_test:x86_64+0x10002bc45)
    #1 repeated_square(unsigned long long, form, integer const&, integer const&, WesolowskiCallback*, FastStorage*, std::__1::atomic<bool>&) <null>:2 (1weso_test:x86_64+0x100039fa7)
    #2 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(unsigned long long, form, integer const&, integer const&, WesolowskiCallback*, FastStorage*, std::__1::atomic<bool>&), unsigned long long, form, integer, integer, OneWesolowskiCallback*, FastStorage*, std::__1::reference_wrapper<std::__1::atomic<bool> > > >(void*) <null>:2 (1weso_test:x86_64+0x100040aa7)

  As if synchronized via sleep:
    #0 nanosleep <null>:2 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x2ccc5)
    #1 std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) <null>:2 (libc++.1.dylib:x86_64+0x15ad0)
    #2 start <null>:2 (dyld:x86_64+0x552d)
    #3 start <null>:2 (dyld:x86_64+0x552d)

  Location is global 'std::__1::cout' at 0x7ff8569a12b0 (libc++.1.dylib+0x4179a348)

  Thread T1 (tid=25547, finished) created by main thread at:
    #0 pthread_create <null>:2 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x2df1f)
    #1 main <null>:2 (1weso_test:x86_64+0x10003da3f)
    #2 start <null>:2 (dyld:x86_64+0x552d)
    #3 start <null>:2 (dyld:x86_64+0x552d)

SUMMARY: ThreadSanitizer: data race (1weso_test:x86_64+0x10002bbe8) in std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::__put_character_sequence<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, char const*, unsigned long)+0xb8
==================

emlowe avatar Apr 05 '24 00:04 emlowe

Ok, if I comment out the line in vdf.h std::cout << "VDF loop finished. Total iters: " << num_iterations << "\n" << std::flush;

The tests pass again.

emlowe avatar Apr 05 '24 00:04 emlowe

This seems to be a TSAN bug in this version of Clang , as per the C++ spec:

In C++11, we do have some guarantees. The FDIS says the following in §27.4.1 [iostream.objects.overview]:

Concurrent access to a synchronized (§27.5.3.4) standard iostream object’s formatted and unformatted input (§27.7.2.1) and output (§27.7.3.1) functions or a standard C stream by multiple threads shall not result in a data race (§1.10). [ Note: Users must still synchronize concurrent use of these objects and streams by multiple threads if they wish to avoid interleaved characters. — end note ]

emlowe avatar Apr 05 '24 00:04 emlowe

@xearl4 I added a lock around cout to prevent the TSAN false positive - it really shouldn't be needed, but 🤷 - too bad we don't have the convenient c++20 <syncstream> - but that would seem a larger change to require c++20

see #176

emlowe avatar Apr 05 '24 16:04 emlowe

This needs #181 merged in first to fix TSAN

emlowe avatar May 13 '24 23:05 emlowe

close and re-open for CI updates

emlowe avatar May 14 '24 14:05 emlowe