phaseField icon indicating copy to clipboard operation
phaseField copied to clipboard

Let users control the checkpoint save/load location

Open stvdwtt opened this issue 5 years ago • 9 comments

We've determined that users can control the output location using the output file name option in the parameters file. However, currently the checkpoint files must be saved and loaded from the working directory. Ideally, users would be able to choose where these files should be. In general, they may even want the checkpoint load directory to be different from the checkpoint save directory (e.g. if they are running multiple continuations of the same simulation).

Here are a couple options on how to deal with this:

  1. Add new entries in the parameters file for the load path and save path Pros: Max flexibility Cons: Breaks parallelism with the output name, since these would just be paths; Checkpoints aren't necessarily saved in the same location as the outputs

  2. Tie the checkpoint save path to the output path (and maybe the load path?) Pros: This is probably the most intuitive behavior Cons: Slight loss in flexibility; Would have to strip off the last part of the output path to get only the directory

Thoughts? @david-montiel-t @kubra

stvdwtt avatar Sep 24 '19 17:09 stvdwtt

Hello Steve,

I think I am for option 1. I like the idea of specifying the path separate from the file name.

To me the most natural thing would be to be able to set the following things separately (all optional):

  • Path of output files
  • Name (base) of output files
  • Path to load from checkpoints
  • Path to save checkpoints

The default option for the last two could be tied to the output files path.

On Tue, Sep 24, 2019, 1:39 PM Stephen DeWitt [email protected] wrote:

We've determined that users can control the output location using the output file name option in the parameters file. However, currently the checkpoint files must be saved and loaded from the working directory. Ideally, users would be able to choose where these files should be. In general, they may even want the checkpoint load directory to be different from the checkpoint save directory (e.g. if they are running multiple continuations of the same simulation).

Here are a couple options on how to deal with this:

Add new entries in the parameters file for the load path and save path Pros: Max flexibility Cons: Breaks parallelism with the output name, since these would just be paths; Checkpoints aren't necessarily saved in the same location as the outputs 2.

Tie the checkpoint save path to the output path (and maybe the load path?) Pros: This is probably the most intuitive behavior Cons: Slight loss in flexibility; Would have to strip off the last part of the output path to get only the directory

Thoughts? @david-montiel-t https://github.com/david-montiel-t @Kubra https://github.com/Kubra

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/prisms-center/phaseField/issues/146?email_source=notifications&email_token=AD3FA42S45HDGEB7K7SFJT3QLJGFFA5CNFSM4I2CYGG2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HNMJCMQ, or mute the thread https://github.com/notifications/unsubscribe-auth/AD3FA4YCX23KSRRK55QI3FDQLJGFFANCNFSM4I2CYGGQ .

david-montiel-t avatar Sep 24 '19 17:09 david-montiel-t

That plan sounds good to me. Kubra says she’s game to try to implement it. I’ll send her an outline of what’s required.

stvdwtt avatar Sep 25 '19 00:09 stvdwtt

Thank you, Steve!

Can you CC me on that?

On Tue, Sep 24, 2019, 8:37 PM Stephen DeWitt [email protected] wrote:

That plan sounds good to me. Kubra says she’s game to try to implement it. I’ll send her an outline of what’s required.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/prisms-center/phaseField/issues/146?email_source=notifications&email_token=AD3FA4ZOZ67OATAA2HMVQ63QLKXDPA5CNFSM4I2CYGG2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD7QGVUA#issuecomment-534801104, or mute the thread https://github.com/notifications/unsubscribe-auth/AD3FA4Y5JW372UJRWCWDS6TQLKXDPANCNFSM4I2CYGGQ .

david-montiel-t avatar Sep 25 '19 00:09 david-montiel-t

@david-montiel-t On second thought, I’ll post it here on GitHub, so there’s a more permanent record. I’ll email both of you when I do in case you don’t get a notification.

stvdwtt avatar Sep 25 '19 01:09 stvdwtt

Great, thanks!

On Tue, Sep 24, 2019, 10:00 PM Stephen DeWitt [email protected] wrote:

@david-montiel-t https://github.com/david-montiel-t On second thought, I’ll post it here on GitHub, so there’s a more permanent record. I’ll email both of you when I do in case you don’t get a notification.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/prisms-center/phaseField/issues/146?email_source=notifications&email_token=AD3FA44YVQ42PZ7VUS2425LQLLAZBA5CNFSM4I2CYGG2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD7QKWAQ#issuecomment-534817538, or mute the thread https://github.com/notifications/unsubscribe-auth/AD3FA4ZDZVODQNLTRU4XRSLQLLAZBANCNFSM4I2CYGGQ .

david-montiel-t avatar Sep 25 '19 02:09 david-montiel-t

How to make this change (big picture):

  1. Update the entries that are available in the parameter.in file
  2. Add the data from those entries to userInputParameters
  3. Edit the output/checkpoint functions to use the new paths

More detailed instructions: (Kubra, since this your first time adding something to PRISMS-PF, I'm going into more detail than I normally would.)

  1. Get a GitHub account :), fork the PRISMS-PF repo, and clone from your fork.

  2. Update the entries that are available in the parameter.in file

  • Go to the declare_parameters method in https://github.com/prisms-center/phaseField/blob/master/src/inputFileReader/inputFileReader.cc. Add a new entry for the output path in the vicinity of line 317. Add new entries for the save path and load path for checkpoint files in the vicinity of line 333.
  1. Add the data from those entries to userInputParameters
  • Go to https://github.com/prisms-center/phaseField/blob/master/include/userInputParameters.h. In the vicinity of line 77, add a variable to store the output path. In the vicinity of 119 add variables for the two checkpoint paths.
  • Go to https://github.com/prisms-center/phaseField/blob/master/src/userInputParameters/userInputParameters.cc. In the vicinity of line 209, add a line to assign the output path to the variable that you just created. In the vicinity of line 304, add lines to assign the checkpoint paths to the variables you just created.
  1. Edit the output/checkpoint functions to use the new paths
  • Go to https://github.com/prisms-center/phaseField/blob/master/src/matrixfree/outputResults.cc. Add the new path variable in front of the output name in the vicinity of lines 104-109. Add the new path variable in front of the "integratedFields.txt" output file name in the vicinity of lines 48-75.
  • Go to https://github.com/prisms-center/phaseField/blob/master/src/matrixfree/checkpoint.cc. Add the appropriate path variables in front of the file names in lines 24-26, 66, 72, 81, 89, 106-107, 113, 168, and 171.
  1. Write tests in the unit test suite to confirm that everything works as expected.
  • The idea here is that we want to have a way of knowing that this new feature works. You can do some informal tests on your own to make sure that it works, but it's better for everyone if you formally include those in the repo.
  • Kubra -- normally the person who adds a feature is responsible for writing the tests. You can write the tests at the same time as you modify the code and see what works and what doesn't. If you're up for writing the tests for this, great. If not, no worries. Especially since this is your first contribution, I want it to be as painless as possible. @david-montiel-t, could you take this on if Kubra doesn't? To my knowledge you haven't done much with the unit test system and this could be a good introduction.

@david-montiel-t and Kubra: if you have any questions about this, let me know. There are no stupid questions.

stvdwtt avatar Sep 26 '19 15:09 stvdwtt

Thank for all the detailed information, Steve!

I think it should be fairly straightforward.

Kubra, let me know if you run into any issues.

David

On Thu, Sep 26, 2019, 11:35 AM Stephen DeWitt [email protected] wrote:

How to make this change (big picture):

  1. Update the entries that are available in the parameter.in file
  2. Add the data from those entries to userInputParameters
  3. Edit the output/checkpoint functions to use the new paths

More detailed instructions: (Kubra, since this your first time adding something to PRISMS-PF, I'm going into more detail than I normally would.)

Get a GitHub account :), fork the PRISMS-PF repo, and clone from your fork. 2.

Update the entries that are available in the parameter.in file

  • Go to the declare_parameters method in https://github.com/prisms-center/phaseField/blob/master/src/inputFileReader/inputFileReader.cc. Add a new entry for the output path in the vicinity of line 317. Add new entries for the save path and load path for checkpoint files in the vicinity of line 333.
  1. Add the data from those entries to userInputParameters
  • Go to https://github.com/prisms-center/phaseField/blob/master/include/userInputParameters.h. In the vicinity of line 77, add a variable to store the output path. In the vicinity of 119 add variables for the two checkpoint paths.
  • Go to https://github.com/prisms-center/phaseField/blob/master/src/userInputParameters/userInputParameters.cc. In the vicinity of line 209, add a line to assign the output path to the variable that you just created. In the vicinity of line 304, add lines to assign the checkpoint paths to the variables you just created.
  1. Edit the output/checkpoint functions to use the new paths
  • Go to https://github.com/prisms-center/phaseField/blob/master/src/matrixfree/outputResults.cc. Add the new path variable in front of the output name in the vicinity of lines 104-109. Add the new path variable in front of the "integratedFields.txt" output file name in the vicinity of lines 48-75.
  • Go to https://github.com/prisms-center/phaseField/blob/master/src/matrixfree/checkpoint.cc. Add the appropriate path variables in front of the file names in lines 24-26, 66, 72, 81, 89, 106-107, 113, 168, and 171.
  1. Write tests in the unit test suite to confirm that everything works as expected.
  • The idea here is that we want to have a way of knowing that this new feature works. You can do some informal tests on your own to make sure that it works, but it's better for everyone if you formally include those in the repo.
  • Kubra -- normally the person who adds a feature is responsible for writing the tests. You can write the tests at the same time as you modify the code and see what works and what doesn't. If you're up for writing the tests for this, great. If not, no worries. Especially since this is your first contribution, I want it to be as painless as possible. @david-montiel-t https://github.com/david-montiel-t, could you take this on if Kubra doesn't? To my knowledge you haven't done much with the unit test system and this could be a good introduction.

@david-montiel-t https://github.com/david-montiel-t and Kubra: if you have any questions about this, let me know. There are no stupid questions.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/prisms-center/phaseField/issues/146?email_source=notifications&email_token=AD3FA46IRSQE67FRB2ECZKTQLTJD7A5CNFSM4I2CYGG2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD7WAIGA#issuecomment-535561240, or mute the thread https://github.com/notifications/unsubscribe-auth/AD3FA472N52I3JJQPKMUZF3QLTJD7ANCNFSM4I2CYGGQ .

david-montiel-t avatar Sep 26 '19 15:09 david-montiel-t

Hi Steve,

Thanks a lot for the detailed instructions, much appreciated!

David,

Let me give it a try first, and I will get in touch if I have questions.

ps: I've just noticed that I already have a GitHub account which was never used :)

-Kubra

kkarayagiz avatar Sep 27 '19 03:09 kkarayagiz

@kkarayagiz To answer your question from the email, here's the big picture for the unit tests in PRISMS-PF. (Eventually this should be in the manual.)

The test directory includes two sections, the automatic tests and the unit tests. The unit tests are a collection of tests that verify that individual small pieces of the code base work as expected. I think there are around 25 tests right now (preferably there would be many more). The automated tests directory contains a Python script that runs the unit tests and also runs regression tests on a handful of the pre-built apps. The idea is that one can run that Python script to confirm that PRISMS-PF is working properly. If something has changed so that the code no longer works, hopefully it will show up as a failed unit test or a failed regression test (that the final result of the calculations for the selected apps has changed from the last time the tests were run).

Now for the structure of the unit tests. First off, the unit test code is a mess. I apologize for that. I had hoped to fix the structure before I left UM but I just didn't have time. Up to now, I'm really the only one who has used the unit tests (that I know of), so I haven't added much documentation.

If you open up the unit_tests directory you'll see a main.cc file. That file calls all of the tests and receives whether or not each test passes. It tallies up the total number of tests as well as the number that pass and it prints updates along the way. If you want to run the unit tests yourself you go to that directory, do "cmake .", then "make debug", then "mpirun -n N main" just like you would for a PRISMS-PF app.

Each test in the unit tests checks either a function or class. Each test is a member of the "unitTest" class is is named "[function/class name]tester". The unitTest class is declared in unitTest.h. It has a member function for each unit test and uses the naming convention "test[function/class name]". Each of those test functions is defined in a file of the same name. Each test function defines some input for the function/class and checks to make sure that the output is the expected output. Most tests have a number of subtests that check various aspects of the function/class. The test function prints out the result of each subtest as well as the test as a whole. Finally it returns the pass/fail result back out to the main function.

A pretty simple example is the "test_computeStress" function. It defines some elastic constants and a strain tensor and then confirms that the "computeStress" function calculates the correct stress. A slightly more complicated example is the "test_LinearSolverParameters" test. There it is testing a whole class. It creates an instance of the class, loads in some parameters and then checks to make sure that one can retrieve those parameters from the object. The most complicated tests are those that test a single function in a class (usually MatrixFreePDE because it is too big to test all at once). If you look at the "test_InvM" function, you'll see that it defines a testInvM class at the top of the file. This class inherits from MatrixFreePDE. The constructor of the class is where the actual test is performed and the output of the test is added as a new member variable. The test function itself just creates a testInvM object and checks to see if the result equals the expected result.

In your case, we want to test a few things. First, can inputFileReader parse the new entries in the parameters file? Second, do the path variables get properly loaded into userInputParameters? Third, do the output files get saved to the specified path? Fourth, do the checkpoint files get saved to the specified paths? Fifth, do the checkpoint files get loaded from the specified paths?

Some of these are more difficult to deal with than others. For the first thing on the list you might not have to write an actual test. If you look at the main.cc file in the unit tests, you'll see that one line 23 we create an inputFileReader object and parse the parameters_test.in file. If you include the new entries in that parameters file then the unit test will fail if parsing fails. (Now, going forward, we might want to turn that into an actual test so that it cleanly fails, but that's probably out of the scope of this.) Next, you'll see on line 24 that we load the inputs into a userInputParameters object. Right now, we just use that for inputs to the other tests. However, what you can do is to create a test (put it at the top) that checks userInputs to make sure that the paths are correct. Later on you or someone else could expand that test to include other aspects of userInputParameters. The tests for the last three items on the list would just be more conventional unit tests like some of the others. You should write tests for the checkpoint and output functions to confirm that they use the correct paths. Eventually the tests could be expanded to cover other aspects of those functions.

Modifying the unit tests to include the first two items should be fairly easy. The other three could be a bit more complicated. I'd recommend trying them, but it's ok if you can't get them to work. David and I can work to finish them.

One thing I want to recommend is the idea of "test-driven development". There the idea is that you write the tests before or while you make the changes. This does a couple things. First, it forces you to actually write the tests. It's easy to plan to write a test after you get done adding the feature, but once you have something that works there's a temptation to move on to other things without writing the test. Second, it allows you to use the test to see if the code you're writing is working. If you write the test first, you'll see that initially the test fails (because you haven't added the feature yet). As you start to add the feature you should see some of the subtests start to pass. Finally, when you are done adding the feature you can see that all of the tests pass. Doing it this way, you can have confidence in your new functionality before you even run your first "real" app with it.

I hope this helps, let me know if anything is unclear.

stvdwtt avatar Sep 27 '19 17:09 stvdwtt