HPX Backend
This includes:
- ~Rename
interfacetointerface_(avoid conflict with some platforms that haveinterfacedefined as a macro)~ added#undef interfaceinstead - Fixing various errors and warnings
- adding build system support for HPX
- adding missing gettimeofday for windows
- replace
usleep/sleepwithstd::this_thread_sleep_for() - workaround for MSVC early template instantiation problem (annotation.hh)
@ipdemes: I just opened this PR for you to follow the progress. It will take more work, however. I'll remove the WIP tag as soon as it's ready to be reviewed.
I don't like renaming interface to an ugly_ name, since (not counting the one unit test that happens to use it) it is in fact exposed to specialization authors (albeit not to application developers). I know we've been down a very similar road with max/min, but should we just #undef it on whatever broken platform (with the caveat that users there would have to make sure not to #include anything after FleCSI headers that might depend on it)? If not, I think we ought to (grudgingly) pick a different word (maybe just access like the related class template in the core topology).
Hartmut,
Thank you!
Irina
From: Hartmut Kaiser @.***> Sent: Tuesday, May 25, 2021 4:26:07 PM To: flecsi/flecsi Cc: Demeshko, Irina P; Mention Subject: [EXTERNAL] Re: [flecsi/flecsi] [WIP] HPX Backend (#1)
@ipdemeshttps://github.com/ipdemes: I just opened this PR for you to follow the progress. It will take more work, however. I'll remove the WIP tag as soon as it's ready to be reviewed.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/flecsi/flecsi/pull/1#issuecomment-848309749, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AB4ZQSCRKVDXDQT4S3PNCGDTPQPX7ANCNFSM45QKSZBQ.
I don't like renaming
interfaceto anugly_name, since it is in fact exposed to specialization authors (albeit not to application developers). I know we've been down a very similar road withmax/min, but should we just#undefit on whatever broken platform (with the caveat that users there would have to make sure not to#includeanything after FleCSI headers that might depend on it)? If not, I think we ought to (grudgingly) pick a different word (maybe justaccesslike the related class template in the core topology).
I'll go with whatever you decide. FWIW, I don't like the ugly_ myself.
I'll remove the WIP tag as soon as it's ready to be reviewed.
I couldn't help myself, as you can see, but I tried to limit my comments to code that looked "done".
@ipdemes I have removed the WIP tag from this as I think the PR has reached a minimal viable state. The tests from the run and exec submodules compile and run now.
@hkaiser : thank you!
@ipdemes I have updated the PR with recent changes from https://github.com/flecsi/flecsi/tree/2. All tests pass for me now. Please let me know if there is anything else you would like me to change.
@ipdemes I have updated the PR with recent changes from https://github.com/flecsi/flecsi/tree/2. All tests pass for me now. Please let me know if there is anything else you would like me to change.
The one thing I changed was to add -DENABLE_MPI_THREAD_MULTIPLE=On to my flecsi build.
@hkaiser thank you! @ollielo can you, please, pull and try to build it ?
@hkaiser
Do you mean when building HPX or building FleCSI?
@hkaiser
Do you mean when building HPX or building FleCSI?
I think it should work without any additional settings. For both, HPX and FleCSI the corresponding options should be enabled by default. Nevertheless I did specify the MPI multi-threaded options:
- HPX:
-DCMAKE_BUILD_TYPE=<Debug/Release> -DCMAKE_INSTALL_PREFIX=<...> -DHPX_WITH_PARCELPORT_MPI=ON -DHPX_WITH_PARCELPORT_MPI_MULTITHREADED=ON -DHPX_WITH_FETCH_ASIO=On -DHPX_WITH_EXAMPLES=Off -DHPX_WITH_TESTS=Off -DHPX_WITH_CXX17=On - FleCSI:
-DCMAKE_BUILD_TYPE=<Debug/Release> -DCMAKE_INSTALL_PREFIX=<...> -DFLECSI_RUNTIME_MODEL=hpx -DENABLE_MPI_CXX_BINDINGS=On -DENABLE_MPI_THREAD_MULITPLE=On -DENABLE_FLOG=On -DCALIPER_DETAIL=high -Dcaliper_DIR=<...> -DHPX_DIR=<...>
@hkaiser
FleCSI it self does not use the ENABLE_MPI_THREAD_MULTIPLE macro. Does it impact some of the HPX header files?
@hkaiser
FleCSI it self does not use the ENABLE_MPI_THREAD_MULTIPLE macro. Does it impact some of the HPX header files?
No, the HPX backend uses multi-threaded MPI in FleCSI by default. Seems that option can be removed, then.
Seems that option can be removed, then.
I believe @tuxfan is concurrently removing CMake support for that macro.
A lot of documentation work has just been merged into 2. There are now release notes wherein to advertise the HPX backend, and there may be some good places to add Doxygen (although the backends need much less of that than does the front-end code).
We do need release notes and an update to
options.rsteven if no other documentation is required because it works just like the other backends.
I updated options.rst (added mentioning of HPX). Where do release notes go?
I updated options.rst (added mentioning of HPX). Where do release notes go?
../news.rst (relative to options.rst). I realize that my creation of the "Changes in v2.2" section hasn't been merged yet; I can do that quickly if you like, or we can come back to this at the end.
Shouldn't we have a Spack variant for HPX?
Shouldn't we have a Spack variant for HPX?
We do have HPX support in Spack.
We do have HPX support in Spack.
Right, sorry. I've been reading lots of documentation lately and not much code, and we don't document the backend=hpx variant. I confused that with the actual code, but we do need to make minor updates to the documentation:
- [x] the very top of
doc/sphinx/src/build.rst - [x] "Parallelization options" in
doc/sphinx/src/build/options.rst - [x]
doc/sphinx/src/news.rst(finally!), in the structural comments at the top and as a new v2.2.0 feature - [x] ideally, in
doc/sphinx/src/summary.rstsomewhere near "The MPI backend uses MPI to transfer data" and "The MPI backend supports only"
@opensdh this PR is now ready for final review. Thanks!
@hkaiser
Current version breaks the MPI backend. Did you make the needed change to it?
@hkaiser
Current version breaks the MPI backend. Did you make the needed change to it?
Will check, thanks!
@hkaiser Current version breaks the MPI backend. Did you make the needed change to it?
Will check, thanks!
That should be fine now. Sorry for this!
@opensdh I have addressed the comments and squashed the commits into one. The commit message now reflects (most of) the changes made overall.
@opensdh I have addressed the comments and squashed the commits into one. The commit message now reflects (most of) the changes made overall.
So you're ready to "release the lock" and let us test it internally (and fix documentation, etc.)?
@hkaiser
It still breaks the mpi backend. First there is one line change need to compile
[ollie@pn2200824 cmake-build-debug]$ git diff
diff --git a/flecsi/io/mpi/policy.hh b/flecsi/io/mpi/policy.hh
index 62694c82a..f0fa067c5 100644
--- a/flecsi/io/mpi/policy.hh
+++ b/flecsi/io/mpi/policy.hh
@@ -156,7 +156,8 @@ struct io_interface {
int item_size = fp->type_size;
std::string field_name = region_name + " field " + std::to_string(fid);
- const auto data = isd.partition->get_raw_storage(fid, item_size);
+ // FIXME: a shorter/better way to call get_raw_storage?
+ const auto data = (*isd.partition)->get_raw_storage(fid, item_size);
hsize_t size = data.size() / item_size;
void * buffer = data.data();
checkpoint_field<W>(
It also seg fault when running the index unit test.
7: AddressSanitizer:DEADLYSIGNAL
7: =================================================================
7: ==203174==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000800d84 bp 0x7ffd97c607d0 sp 0x7ffd97c606a0 T0)
7: ==203173==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000800d84 bp 0x7ffe550d7630 sp 0x7ffe550d7500 T0)
7: ==203173==The signal is caused by a READ memory access.
7: ==203174==The signal is caused by a READ memory access.
7: ==203174==Hint: address points to the zero page.
7: ==203173==Hint: address points to the zero page.
7: #0 0x800d84 in auto flecsi::data::local::partition_impl::get_storage<Noisy, (flecsi::exec::task_processor_type_t)0>(unsigned long) const /home/ollie/CLionProjects/flecsi/flecsi/data/local/policy.hh:127:12
7: #1 0x8060c5 in void flecsi::exec::task_prologue<(flecsi::exec::task_processor_type_t)0>::visit<Noisy, 7u, flecsi::topo::index, (flecsi::topo::single_space)0>(flecsi::data::accessor<(flecsi::data::layout)0, Noisy, 7u>&, flecsi::data::field_reference<Noisy, (flecsi::data::layout)0, flecsi::topo::index, (flecsi::topo::single_space)0> const&) /home/ollie/CLionProjects/flecsi/flecsi/exec/mpi/task_prologue.hh:95:17
7: #2 0x805e98 in auto auto flecsi::exec::prolog<(flecsi::exec::task_processor_type_t)0>::visitor<flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const>(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&)::'lambda'(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&, auto&&)::operator()<flecsi::data::accessor<(flecsi::data::layout)0, Noisy, 7u>, void flecsi::data::accessor<(flecsi::data::layout)2, Noisy, 7u>::send<'lambda'(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&, auto&&)>(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&&)::'lambda'(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const const&)>(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&, auto&&) const /home/ollie/CLionProjects/flecsi/flecsi/exec/prolog.hh:64:34
7: #3 0x805aa6 in void flecsi::data::accessor<(flecsi::data::layout)2, Noisy, 7u>::send<auto flecsi::exec::prolog<(flecsi::exec::task_processor_type_t)0>::visitor<flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const>(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&)::'lambda'(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&, auto&&)>(flecsi::data::field_reference<Noisy, (flecsi::data::layout)2, flecsi::topo::index, (flecsi::topo::single_space)0> const&&) /home/ollie/CLionProjects/flecsi/flecsi/data/accessor.hh:241:5
...
Is the r in partition_impl properly inited?
@ollielo Thanks for testing. I have not run the tests with the MPI backend yet - I will take a look.
It actually crashes at the destructor/finializer in the cleanup(). Hope this helps.