MITgcm icon indicating copy to clipboard operation
MITgcm copied to clipboard

Specify extra directories as runtime variables

Open owang01 opened this issue 3 years ago • 4 comments

What changes does this PR introduce?

New feature to allow extra directories provided as runtime parameters

What is the current behaviour?

By default, I/O of xx, adxx, other active files, and pkg/smooth related files is done in the run directory. However, in some case, using another directory/disk might be more desirable. As an example, see issue #535 about code performance for NASA Ames' Pleiades.

What is the new behaviour

Runtime parameters "ctrlDir" and "smoothDir" can be specified in data.ctrl and data.smooth. I/O of xx, adxx, other active files, and pkg/smooth related files will be conducted in the specified "ctrlDir" and "smoothDir".

Does this PR introduce a breaking change?

Should not, especially if "ctrlDir" and "smoothDir" are not specified.

Other information:

  • When "ctrlDir" and "smoothDir" are specified, symbolic links (or copies) to control weights and masking, smoothing scaling files etc. need to be created in "ctrlDir" and "smoothDir".
  • In addition to xx and adxx files, "ctrlDir" directory would also include all other active files. So "ctrlDir" could be renamed to "actfDir" or something similar.

Suggested addition to tag-index

o pkg/ctrl:

  • add run-time variable ctrlDir to specify a directory different from the default run directory for I/O of controls adjustment and its adjoint gradient (xx and adxx), and other active files.

o pkg/smooth:

  • add run-time variable smoothDir to specify a directory different from the default run directory for I/O of smoothing operators.

owang01 avatar Jul 16 '22 21:07 owang01

@owang01 I am a little confused about the AUTODIFF switch. I see the dilemma that sometimes active_read/write is used and sometimes read/write_rec ..., hence the confusing situation that you add "ctrlDir" when ALLOW_AUTODIFF is undefined. Would it make more sense to remove the changes in active_readwrite.F and instead always pass the complete file name (including ctrlDir)? I seems that this is much more work in different pkgs (small changes in many files), but in this way we'd have control over which packages use "ctrlDir", or I can even imagine having a "ctrlDir", an "eccoDir", "profilesDir", that could be set separately (or default to ctrlDir).

mjlosch avatar Jul 19 '22 07:07 mjlosch

Thanks for this PR @owang01, I think it will be useful and clean. I do agree with @mjlosch though, I think the implementation would be cleaner by explicitly setting the filename before passing to active_readwrite or any other read routine. This will keep the code more modular as he mentioned, and I think it will also just be easier to understand. For instance it was no problem to understand the smoothDir additions, but it took me a while to wrap my head around how ctrlDir works.

Changing the implementation in this way could involve changing a lot of filename definitions, so I think it will be much easier to do after #631.

timothyas avatar Jul 28 '22 15:07 timothyas

@mjlosch and @timothyas The latest commit now follows your suggestions: ctrlDir has to be added to filenames before calling active_file.F and active_file_ad.F. This is indeed a cleaner implementation. Another change is I now allow ctrlDir to be used in both forward and adjoint mode. Previously, ctrlDir can be actively used only when ALLOW_AUTODIFF is defined.

owang01 avatar Aug 17 '22 20:08 owang01

Thanks @owang01, I think this looks good. It is much easier to see what is going on just while looking through the code. I'll make some fairly detailed comments but otherwise I think it looks good.

timothyas avatar Aug 19 '22 16:08 timothyas

Thanks @owang01 for this PR, I think it is good to go! LGTM 🚀

timothyas avatar Oct 17 '22 19:10 timothyas

@timothyas Thank you for reviewing the PR and making the suggestions.

owang01 avatar Oct 17 '22 20:10 owang01

Another comment: Is this tested anywhere? Can we modify a test by just setting ctrlDir = './'?

mjlosch avatar Oct 18 '22 13:10 mjlosch

@owang01 I added ctrlDir = 'ctrl_variables' to lab_sea/input/data.ctrl, removed everything in lab_sea/run/ and reran testreport. The control variables are written to the new subdirectory 'ctrl_variables', but cannot be found for reading:

 link mitgcmuv_ad from dir ../build
 Divided Adjoint Run: add_DIVA_runs=4
STOP NORMAL END
 additional DIVA run # 1 : done
STOP NORMAL END
 additional DIVA run # 2 : done
STOP NORMAL END
 additional DIVA run # 3 : done
STOP NORMAL END
 additional DIVA run # 4 : done
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG
STOP ABNORMAL END: S/R MDS_READ_FIELD
 MDS_READ_FIELD: filename: adxx_atemp.0000000000 , adxx_atemp.0000000000.001.001.data
 MDS_READ_FIELD: Files DO not exist

So guess there's some code missing somewhere. I think it has to do with pkg/grdchk.

So we should definitely add a test of this functionality with more than just "./"

mjlosch avatar Oct 19 '22 08:10 mjlosch

@mjlosch Thank you for catching this. I made changes in pkg/grdchk. Now the lab_sea experiment will run after I add ctrlDir = 'ctrl_variables' in the namelist CTRL_NML in data.ctrl. Testreport reports "PASS"s for lab_sea, lab_sea.noseaice, and lab_sea.noseaicedyn. I will include this change to the lab_sea experiment in my new commit so this functionality can be tested.

owang01 avatar Oct 25 '22 20:10 owang01

Thanks @mjlosch for your review and approval. My new commit addresses your most recent suggestions.

owang01 avatar Oct 26 '22 15:10 owang01

@owang01 I am proposing to remove the new "go to" in 1 or 2 places. I will start with the new S/R ADD_ADPREFIX ; if you don't like it we will revert it, otherwise I will apply the same changes to the 1 or 2 other "go to" addition.

jm-c avatar Nov 09 '22 19:11 jm-c

Thanks @jm-c Sounds good. Look forward to seeing your changes.

owang01 avatar Nov 09 '22 20:11 owang01

I also simplify the writing of the output file-name, now done only once.

jm-c avatar Nov 09 '22 22:11 jm-c

@jm-c Looks good to me. But I'm not sure why some openad checks failed.

owang01 avatar Nov 11 '22 16:11 owang01

@owang01 it's possible (but not very likely) that these openad fail are due to my changes, but since my commit is the first one after adding the label "adjoint" to this PR, these 2 openad experiments were not tested before, so it's also possible that the problem was there before my changes. I will take a quick look in the coming days.

jm-c avatar Nov 11 '22 19:11 jm-c

Thanks @jm-c That makes sense.

owang01 avatar Nov 11 '22 20:11 owang01

I tried without my changes to active_file_ad.F, on engaging, and these 2 OpenAD exp also fail with similar error. I don't want to spent too much time to fix OpenAD, so will push a little change in pkg/openad that seems to work for me. Will see it this pass the CI test.

jm-c avatar Nov 12 '22 01:11 jm-c

@owang01 There is an other problem: the tangent-linear test for experiment lab_sea is broken (trying to read a file "g_./ctrl_variables/xx_theta.0000000000" that does not exist) , and this has to do with missing changes in pkg/autodiff/active_file_g.F. I tried to change this file the same way (or similar way) as you changed active_file_ad.F and it did work for me.

But instead of having 2 little S/R (add_adprefix & add_g_prefix), the first one to add prefix "ad" and the second one to add prefix "g_", we could have a single one with a third argument to set the prefix to use, like this one (except GitHub does not let me attach my Fortran src file) Let me know you think about it.

jm-c avatar Nov 12 '22 05:11 jm-c

@owang01 I just added my modified/new S/R with 3 arguments (but currently never called), just so that you can see it.

jm-c avatar Nov 12 '22 05:11 jm-c

@owang01 If you agree, I could push changes to a) use this S/R ADD_PREFIX instead of ADD_ADPREFIX and b) fix the Tang-Linear lab_sea test exp. And then after you could review these changes.

jm-c avatar Nov 15 '22 18:11 jm-c

@jm-c Thanks for making the changes. Looks good to me. Please push the changes.

owang01 avatar Nov 15 '22 19:11 owang01

@owang01 I will push an other set of changes to the - mostly unused - other active_file_*.F in pkg/autodiff, just to use the ADD_PREFIX S/R also there. The active_file_loc and active_file_gen don't seem to be used but if they were they would need also this little update.

jm-c avatar Nov 17 '22 05:11 jm-c

Since there was some unexpected (weird) compilation issues with g77 (now fixed), I also tried to run testreport -adm with few old compilers (open64 & pgi) as well as recent intel compiler (2021.4.0) and all pass. So all is good and will merge this PR later today (unless someone wants to add things).

jm-c avatar Nov 17 '22 20:11 jm-c