CoreNeuron
Introduce robust and scalable logging
The Problem
Currently, logging in coreneuron is done through a mix of `printf`s, `DEBUG` ifdefs, and a bit of MPI to print only on rank 0. This code is hard to maintain, messy, and limiting, and the output is ugly.
Considerations
- There are several powerful C++ logging frameworks around. We mostly have experience with spdlog, which has served us very well so far (in other projects).
- The difficulty here is that we run coreneuron simulations almost exclusively under MPI, but spdlog does not include any direct MPI support.
- We would like different types of log messages (information, error, debug), and they should be switchable at runtime.
- It should be possible to log everywhere, on rank 0, or on another specific rank. Ideally one could create multiple log objects that each log on their specific ranks.
Example:

```cpp
auto root_logger = spdlog::basic_logger_mt("basic_logger", 0);       // rank 0 or MPI_COMM_SELF?
auto all_logger  = spdlog::basic_logger_mt("basic_logger", LOG_ALL); // or MPI_COMM_WORLD?
spdlog::set_default_logger(root_logger);
```
- The log objects probably don't need to do much more than guard the actual logging with an if statement.
- If all ranks (or all ranks in a communicator) log a message, the logger has to take care that the messages are coordinated and don't overwrite each other. This could be achieved through a pattern like:
```cpp
for (int rank = 0; rank < num_ranks; rank++) {
    if (rank == myrank) {
        log::debug(msg);
    }
    MPI_Barrier(comm);
}
```
(i.e. users of the logger shouldn't have to worry about buffer caching, MPI synchronisation, etc.)
Caveats
- Log aggregation is not in scope for this issue, so I would not worry too much about it for now.
- Should we already support logging to file? spdlog has good support for this, but it's not clear how this would work in an MPI setting without log aggregation, and maybe it isn't necessary to do quite yet.
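One possible interim answer, if file logging is wanted before aggregation exists, is one file per rank, so writers never collide. A sketch; the helper name and naming scheme are assumptions, not an agreed convention:

```cpp
#include <string>

// Build a per-rank log file name, e.g. "corenrn.3.log" for rank 3.
// One file per rank means no write collisions and no aggregation step.
std::string per_rank_logfile(const std::string& stem, int rank) {
    return stem + "." + std::to_string(rank) + ".log";
}
```

The resulting name could then be handed to spdlog's basic file sink, e.g. `spdlog::basic_logger_mt("corenrn", per_rank_logfile("corenrn", rank))`, leaving any merging of the files to post-processing.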
This proposal looks fine to me (some minor tweaks we can do later).
Note: the `MPI_Barrier`-based loop is needed to avoid interleaving MPI messages, but as you know, that isn't efficient either. We can discuss/decide later what to do about that.