BOUT-dev icon indicating copy to clipboard operation
BOUT-dev copied to clipboard

Replace DataFile/Dataformat with OptionsNetCDF

Open ZedThree opened this issue 4 years ago • 19 comments

Replace all the existing DataFile/Dataformat infrastructure with OptionsNetCDF

Closes #1254

Currently:

  • restart files moved to OptionsNetCDF
  • additional BOUT.dmp.experimental.*.nc files using OptionsNetCDF -- separate files for now just for testing!
  • time-evolving variables added to Solver are written
  • build config is written
  • method for checking all time-evolving variables have the same sized time dimension

Missing:

  • ~user-added variables, using e.g. SAVE_ONCE or SAVE_REPEAT~
  • upgrade tool -- EDIT: this might not happen, as it's seems quite tricky to automatically upgrade
  • direct read/write method -- direct read not yet implemented, but direct write can be done with e.g. OptionsNetCDF{}.write({{"key": value}})
  • ~single-method for adding a time-evolving variable to Options~

One possibly blocking issue at the moment is that OptionsNetCDF writes all the keys in lowercase. This causes problems with e.g. collect, which expects BOUT_VERSION to be in uppercase, so thinks the files are pre-v0.2! That's a simple fix for that one variable in collect, but I am a little bit worried more generally about what tools expect certain variables in the files to be written in uppercase. @johnomotani what does xBOUT do?

We also need to decide on exactly what the interface will be for users to add variables. In the current form, there's now an Options member of PhysicsModel, output_options, which gets written to file in PhysicsModel::PhysicsModelMonitor::call. My feeling is we probably want a PhysicsModel::outputVars(Options&) method which gets called by PhysicsModelMonitor.

ZedThree avatar Jan 13 '21 18:01 ZedThree

what tools expect certain variables in the files to be written in uppercase. @johnomotani what does xBOUT do?

xarray/xBOUT are case sensitive at the moment. I guess we could work around by renaming everything to lower-case when we load. I'd prefer to keep the output case-sensitive though (otherwise for example variable names will be converted to lower-case when printed, which I would find surprising as a user).

Actually I would vote for making Options case-sensitive (I find that behaviour less surprising myself, and it makes the code slightly simpler), although it's not backward compatible. If we make warnings about unused options more prominent (e.g. print at the beginning of the run, after the first rhs call, and also at the end of the run), I think that would mitigate the worry about getting unexpected behaviour because an option was typed with the wrong case.

johnomotani avatar Jan 13 '21 18:01 johnomotani

I do think case-sensitive Options should be considered, though I would be more keen on it if we had a better way of dealing with unused/unknown/mistyped options.

If we didn't want to do that, we could probably modify Options to keep the original case but reads were case-insensitive by changing the option name from std::string to something like:

struct CaseInsensitiveKey {
  std::string original_case; // Store the original case as well as the lower case version
  std::string lower_case;
  
  CaseInsensitiveKey(const std::string& key) : original_case(key), lower_case(lowercase(key)) {};

  // Implicitly converts to std::string
  operator const std::string& () const { return original_case; }

  // Order by the original case -- needed for std::map
  bool operator< (const std::string& other) { return other < original_case; }

  // When checking for equality, check the original case, then lowercase
  bool operator== (const std::string& other) {
    if (other == original_case) {
      return true;
    }
    return lowercase(other) == lower_case;
  }
};

Also needs assignment, copy, move operators, but that would probably allow us to keep the case-insensitivity when reading the options file, but have case sensitivity when writing. But more code, more chance for things to go wrong.

ZedThree avatar Jan 14 '21 11:01 ZedThree

I agree it's tricky, and case sensitivity simplifies lots of things in the code. On the other hand I would not want to have keys which differed by case, or users thinking they had changed an input where they hadn't (because they got the case wrong).

I worry that ordering by original case, and then comparing against both strings would break the behavior of maps e.g. a red-black tree might not match the case insensitive case if they've followed the ordering using the original case.

Perhaps a variation on your suggestion:

struct CaseInsensitiveKey {
  std::string original_case; // Store the original case as well as the lower case version
  std::string lower_case;
  
  CaseInsensitiveKey(const std::string& key) : original_case(key), lower_case(lowercase(key)) {};

  // Implicitly converts to std::string
  operator const std::string& () const { return original_case; }

  // Order by the lower case -- needed for std::map
  bool operator< (const std::string& other) { return lowercase(other) < lower_case; }

  // When checking for equality, check lowercase
  bool operator== (const std::string& other) {
    return lowercase(other) == lower_case;
  }
};

So we do all comparisons in lower case, but retain the original case

bendudson avatar Jan 14 '21 12:01 bendudson

Quick question: If the keys to the map are CaseInsensitiveKey do we need comparisons to strings, or just between CaseInsensitiveKeys? That would eliminate conversion to lower case, apart from in the constructor

bendudson avatar Jan 14 '21 13:01 bendudson

If we went case-sensitive, how about something like this to help avoid errors:

  • Have an option like strict (true or false, default to false), that makes it an error to run with unused options - that should catch any typos or attempts to use the wrong case. I think Options already has some method to flag unused options - we could check whether there are any after the first rhs call and raise an error if there are, printing the unused options.
  • An option like validate_input. To start with this could be equivalent to just nout=0 strict, so that a user can check that their input file is OK by running something like
    ./mymodel validate_input
    
    Over time we could maybe add some extras, like if there is an unused option searching for similar options (e.g. different case, one-letter typos) and ask if those were intended.

I agree keys that differ by case are horrible, but I'd be inclined to leave it as just bad practice rather than make it an error. If you'd rather be certain, we could add a check like if (new_key.lower() == existing_key.lower()) throw BoutException(...)?

johnomotani avatar Jan 14 '21 13:01 johnomotani

I think the strict check @johnomotani suggests would be useful whether or not keys are case sensitive. I'd support making that true by default. In which case that removes one of the arguments for case insensitivity. I quite like your idea about checking when creating keys if they match existing keys case-insensitively, but currently keys are created automatically if they don't already exist, so this might need to be done every time an Options operator[] is called, with a scan and comparison against all the keys in the dictionary. Perhaps this is not needed if we're catching unused keys. Perhaps this is something as you say that should be a good-practice thing rather than implemented in the code. We'd need to decide some convention on when to use upper case (or always lower case?)

bendudson avatar Jan 14 '21 13:01 bendudson

this might need to be done every time an Options operator[] is called

I think it'd only be needed when (possibly) creating a new key - if the requested key is already in the map, then we'd return before getting to the check.

johnomotani avatar Jan 14 '21 13:01 johnomotani

Ok, I will make a separate PR with modifications to case sensitivity:

  • remove case insensitivity (breaking change)
  • add strict (or better: input:strict?) option, defaulting to true, which checks for unused options
    • this could be backported, with default false
    • where should check be? In Solver::setModel, after call to PhysicsModel::initialise? Or in Solver::solve before call to Solver::run?
  • add validate_input (or better: input:validate?), which just stops after initial strict check (and crucially, doesn't write to file!)

Adding a simple Levenshtein distance would allow us to have more helpful feedback: "tpyo_option not found: did you mean typo_option?"

ZedThree avatar Jan 14 '21 15:01 ZedThree

Thanks @ZedThree this sounds good thanks!

  • I vote for input:strict and input:validate to keep things namespaced, and gradually reduce the number of switches outside sections
  • It needs to be checked after the RHS has been called once. Perhaps here: https://github.com/boutproject/BOUT-dev/blob/master/src/solver/solver.cxx#L476 but we would need to move the run_rhs call to before the dump_on_restart check, so that it is always run.
  • Definitely yes to making the inputs more helpful. For each unused key, we could loop through the keys which were set (presumably to default values) and suggest the closest match.

bendudson avatar Jan 14 '21 15:01 bendudson

input:strict sounds good, and I guess input:validate is the same number of characters as validate_input, so is better. To make validate more discoverable, could we maybe add a command-line option that comes up when you do ./mymodel --help?

johnomotani avatar Jan 14 '21 15:01 johnomotani

Yes, good idea! I'd really love for ./mymodel --help to list all the possible options, but I think that would require a massive overhaul of how we used options. It could probably be done using the init-time registration similar to how the factories work.

ZedThree avatar Jan 14 '21 16:01 ZedThree

We could have input:validate write the BOUT.settings file before exiting. That should then contain all the settings, types and documentation.

bendudson avatar Jan 14 '21 16:01 bendudson

Oh yes, that's probably good enough. Maybe just print it to screen, to avoid clobbering an existing BOUT.settings file?

ZedThree avatar Jan 14 '21 16:01 ZedThree

This is almost ready to remove the bout::globals::dump instance, as well all of Datafile and its implementations.

There are (at least!) two remaining sticking points:

  1. LaplaceNaulin and LaplaceIPT have a couple of performance diagnostics that they add to bout::globals::dump. These are a little bit tricky to refactor. My current idea is to copy an idea from LaplaceXY: have a savePerformance(Solver& solver) method that, when called, adds a Monitor to solver. Then the monitor has outputVars(Options&) where it can add whatever it likes. The issue is that savePerformance would have to be a virtual method on the base class for this to be useful really.
  2. The subsampling example probably needs to be completely rewritten, as it's heavily tied to the Datafile interface.

ZedThree avatar Jun 10 '21 14:06 ZedThree

I need to write some docs on all the changes, so not really ready to be merged, but otherwise the code is ready for review.

This is a pretty big PR, so I'd happy to break it up into some smaller pieces. The major changes, in rough dependency order, are:

  • modifications to Options:
    • assignRepeat to handle "time_dimension" attribute nicely
    • always store location/direction attributes for Fields
    • specialisations of Options::as<> for Array/Matrix/Tensor
  • modifications to OptionsNetCDF:
    • handle floats and shorts
    • better handling of multiple time dimensions
    • helper functions for writing the main/default output file (BOUT.dmp.X.nc)
  • use OptionsNetCDF instead of DataFormat in GridFile
  • add outputVars(Options&) to various classes
  • add savePerformance, monitor to Laplacian/LaplaceXY to replace use of bout::globals::dump
  • using OptionsNetCDF instead of bout::globals::dump in non-model based tests
  • use OptionsNetCDF instead of Datafile in PhysicsModel
    • also involves modifications to Solver to support this
    • DataFileFacade to make SAVE_ONCE/REPEAT still work inside PhysicsModel
    • fixes for boutcore
  • actual removal of Datafile/DataFormat

Most bullet points could be standalone PRs.

Even with all these changes, this PR is net -7k lines!

For a future PR: refactor GridDataSource.

ZedThree avatar Jun 16 '21 15:06 ZedThree

Failing test is clang-tidy-review because there's a bug in clang-tidy that makes it choke on LaplaceIPT for some reason.

ZedThree avatar Jun 17 '21 08:06 ZedThree

One of the things I don't especially like about the design here is the outputVars method that is now in lots of places:

  1. the name isn't great, I chose it because there were already existing methods with that name
  2. the Options& output_file is an in-out parameter

Both of these can be changed. The second one might require a bit more machinery -- we can write an Options to an existing file, appending new variables and overwriting existing ones, but it might be nicer to be able to merge into an existing Options and minimise the number of separate writes to disk.

I've also not benchmarked this PR against next. I suspect this may be marginally slower, but perhaps the reduction in complexity is worth it if it's not too much slower.

ZedThree avatar Jun 21 '21 13:06 ZedThree

Some timing information from blob2d, run on 16 cores, averaged over three runs (all the numbers have been averaged, I've not recalculated the "% of top" or "Mean time" columns to be consistent). I've added some extra timers to places. In the below tables, io is inclusive of the other io- rows, which are all exclusive subsets:

  • io-write: either OptionsNetCDF::write() or Datafile::write()
  • io-outputvars: every function named outputVars plus setupDumpFile in next
  • io-verify: OptionsNetCDF::verifyTimesteps -- this is called once per output timestep to check all time dimensions in the output file are the same length. This could be an input option to do this.

The following are non-exclusive subsets:

  • io-add: Datafile::add, so should be ~io-outputvars
  • io-ncxx4: anything inside the Ncxx4 class

Here's this branch:

Timer name Total time (s) % of top Hits Mean time/hit (s)
run 97.6347 1 1 97.6347
rhs 72.2145 0.74 1555 0.0464402
invert 16.883 0.17 1555 0.0108572
io 2.97422 0.03 138 0.0215523
io-write 2.55597 0.03 67 0.0381488
comms 0.549516 0.01 12616 4.35571e-05
io-findDimension 0.00349238 0 1045 3.34199e-06
io-outputvars 0.0329373 0 48 0.000686194
io-verify 0.384566 0 11 0.0349605

Standard deviation for the total time was 2.2s, and 0.15 for io

Here's next (with extra timers):

Timer name Total time (s) % of top Hits Mean time/hit (s)
run 97.1441 1 1 97.1441
rhs 74.0155 0.76 1555 0.0475984
invert 17.2017 0.18 1555 0.0110622
io 2.04389 0.02 33 0.0619361
io-ncxx4 1.64624 0.02 1960 0.000839917
comms 0.588739 0.01 12616 4.66661e-05
io-add 1.24521 0.01 153 0.00813861
io-outputvars 1.18999 0.01 5 0.237999
io-write 0.763503 0.01 22 0.0347047
io-open 0.0287595 0 2 0.0143797

Standard deviation for the total time was 7.7s, and 0.21 for io


So it looks like this branch is a bit slower, and the overheads are inside OptionsNetCDF::write rather than in the copying variables into the Options in outputVars.

Also, this was all done with debug builds. A quick look at optimised builds shows io dropping a little, but total run time is now ~10 s, so io fraction goes up a lot.

ZedThree avatar Jun 22 '21 16:06 ZedThree

This now has #2471 merged in, as I was testing the performance with it.

I think I've now eliminated the performance hit this PR was introducing -- it seems to have been due to re-creating the NcFile with every call to OptionsNetCDF::write (i.e. every timestep). Now, the NcFile is a member of OptionsNetCDF and the file is opened on demand and synced every timestep.

OptionsNetCDF::read still creates a new NcFile object (i.e. immediately opens a file on disk, reads the contents, and then closes it).

I think the interaction between reading and writing here should be fine, but is a potential cause for concern. It would probably only affect restart files, and we shouldn't have them open for both reading and writing at the same time, so probably fine.

ZedThree avatar Dec 06 '21 18:12 ZedThree

Fixed conflicts and an actual bug that clang-tidy caught. This is now ready to go in

ZedThree avatar Aug 31 '22 16:08 ZedThree

I'm merging this now so that we can get some actual experience with it in next before the imminent 5.0 release

ZedThree avatar Sep 14 '22 16:09 ZedThree