ufs-weather-model
Feature/detect frontera
Commit Queue Requirements:
- [x] Fill out all sections of this template.
- [ ] All sub component pull requests have been reviewed by their code managers.
- [ ] Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
- [ ] Commit 'test_changes.list' from previous step
Description:
Commit Message:
* UFSWM - This only affects detect_machine.sh
Priority:
- Normal
Git Tracking
UFSWM:
- Closes # https://github.com/NOAA-EMC/global-workflow/issues/2570 (Note that I did not realize ufs-weather-model was the authoritative copy of this script when I created the issue)
Sub component Pull Requests:
- None
Changes
Regression Test Changes (Please commit test_changes.list):
- No Baseline Changes.
Input data Changes:
- None.
Library Changes/Upgrades:
Testing Log:
- RDHPCS
- [ ] Hera
- [ ] Orion
- [ ] Hercules
- [ ] Jet
- [ ] Gaea
- [ ] Derecho
- WCOSS2
- [ ] Dogwood/Cactus
- [ ] Acorn
- [ ] CI
- [ ] opnReqTest (complete task if unnecessary)
Letting @NOAA-EMC/teams/global-workflow-admins know so they can get a PR to bring in these changes as needed.
Thought that team links worked, but alas they don't. @aerorahul letting you know detect_machine.sh is getting a change.
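For readers following along, hostname-based platform detection of the kind detect_machine.sh performs can be sketched roughly as below. This is only an illustration: the function name `detect_platform` and the `*.frontera.tacc.utexas.edu` pattern are assumptions, not the contents of the actual script or of this PR.

```shell
#!/usr/bin/env bash
# Sketch only: detect_platform and the hostname pattern are assumptions,
# not taken from the real detect_machine.sh.
detect_platform() {
  # Use the first argument if given, otherwise the fully qualified hostname.
  local host="${1:-$(hostname -f)}"
  case "${host}" in
    *.frontera.tacc.utexas.edu) echo "frontera" ;;
    *) echo "unknown" ;;
  esac
}
```

Calling `detect_platform` with no argument checks the current host; passing a hostname makes the function easy to exercise off-platform.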
@BrianCurtis-NOAA - I have another set of small changes in this same line. One is a minor update to modules-setup.sh, and the other is to add ufs_frontera.intel.lua to modulefiles. Would it make sense to just add those into this PR, or should I let this one close out and then open a new one?
You can keep making changes here, just let me know when you're done.
@benjamin-cash Are you also planning to activate rt.sh on Frontera? I lost track of what I added to ufs-coastal, but you could check it there if you want: https://github.com/oceanmodeling/ufs-coastal. BTW, let me know if you need help. It would be nice to have Frontera support at the ufs-weather-model level. Thanks for doing it.
@uturuncoglu - If you have rt.sh working for Frontera I would love to get that into ufs-weather-model as well. I think that can be separated from this PR though, so I won't add anything more to this one. (@BrianCurtis-NOAA)
@benjamin-cash Yes, rt.sh is working in UFS Coastal and we are running ufs-coastal-specific RTs with it. I synced ufs-coastal with ufs-weather-model a couple of days ago, so if you look at the diff here you should see the changes around rt.sh: https://github.com/ufs-community/ufs-weather-model/compare/develop...oceanmodeling:ufs-coastal:feature/coastal_app This also includes changes specific to ufs-coastal, such as extra components.
@uturuncoglu - I tried running cpld_control_c192_p8 via the coastal rt.sh, and ran into errors. It looks like in default_var.sh the only variable added for frontera was TPN=56, and none of the other variables like INPES_dflt. Did I miss a step, or would those still need to be updated to run the rest of the tests?
@benjamin-cash Yes, that needs to be extended. I have no experience with those numbers. Maybe @BrianCurtis-NOAA could help with it.
@benjamin-cash I just set that one number and it works with the coastal app, but other RTs probably use more platform-specific parameters. If we could add the others, that would be great.
@uturuncoglu - Makes sense. I'm going to try just copying in the settings for Derecho and see how far it makes it.
@benjamin-cash Okay. If we could use another platform that is as close to Frontera as possible, it would be a good starting point.
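As a rough illustration of what extending the per-platform defaults might look like, here is a minimal sketch. Only `TPN=56` for Frontera comes from the discussion above; the function name `set_platform_defaults` is hypothetical and the `INPES_dflt` value is a placeholder, not a tuned or tested setting.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of per-platform defaults.
# Only TPN=56 comes from the thread; INPES_dflt is a placeholder value.
set_platform_defaults() {
  local machine_id="$1"
  case "${machine_id}" in
    frontera)
      TPN=56          # tasks per node, as set in the coastal fork
      INPES_dflt=3    # placeholder: would need to be chosen for Frontera
      ;;
    *)
      echo "No defaults defined for '${machine_id}'" >&2
      return 1
      ;;
  esac
}
```

Failing loudly for an unknown platform, rather than silently leaving variables unset, makes a missing entry visible before any test is launched.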
@uturuncoglu - That was enough to at least get the test started, but then it failed because it was looking for the wrong WW3_input_data_* directory. I'll have to track down where that is set.
@benjamin-cash It is probably pointing to my input directory. Is there any place on Frontera where we could stage at least part of the UFS input data? Then maybe we could just put the input files for the coupled control p8 test there and point to it with the disknm variable in the Frontera part of rt.sh.
There are a couple of options for data. One is that we could store the data on Ranch (Frontera archive system), and then stage the data to $SCRATCH on Frontera and recopy as needed. Or someone who is working on UFS on the system but not storing a lot of data otherwise could keep the files in their $WORK space. Do you know what the data volume is?
@benjamin-cash The input folder on Derecho is around 275G: /glade/derecho/scratch/epicufsrt/ufs-weather-model/RT/NEMSfv3gfs/input-data-20240501. I think this includes all the data. But if that is too much, maybe we could selectively copy just a couple of folders to run the major tests.
@uturuncoglu - At 275GB we can definitely find somewhere for that to sit. I don't think I have access to Derecho at this point. Could you Globus that directory to $SCRATCH on Frontera and let me know where you put it? I can figure out a more permanent location for it from there.
@benjamin-cash Sure. Let me copy it over. I'll let you know when it is finished.
@benjamin-cash I copied the files to /scratch1/01118/tg803972/RT/NEMSfv3gfs. Let me know if you have any issues accessing it. I think you need to create a develop-20240430 folder under this directory to run any RT. Then maybe we could place baselines for a couple of RTs under develop-20240430. I am not sure whether running the full test suite on Frontera is feasible at this point.
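The directory setup described above could be scripted along these lines. This is a sketch under the assumption that the baseline folder simply sits next to the staged input data; the helper name `prepare_rt_dirs` is hypothetical and is not part of rt.sh.

```shell
#!/usr/bin/env bash
# Sketch: create a baseline directory under the RT root.
# prepare_rt_dirs is a hypothetical helper, not part of rt.sh.
prepare_rt_dirs() {
  local rt_root="$1"       # e.g. the staged RT/NEMSfv3gfs directory
  local baseline_tag="$2"  # e.g. develop-20240430
  if [[ ! -d "${rt_root}" ]]; then
    echo "RT root '${rt_root}' does not exist" >&2
    return 1
  fi
  mkdir -p "${rt_root}/${baseline_tag}"
}
```

For the location mentioned above, the call would be `prepare_rt_dirs /scratch1/01118/tg803972/RT/NEMSfv3gfs develop-20240430`.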
@uturuncoglu - This discussion has wandered pretty far afield from the PR, so I'm going to move the discussion of the rt files to email. :)
@BrianCurtis-NOAA - It looks like this is stuck waiting on reviews to come in (assuming updating didn't break anything just now), any chance you could help nudge this along?
@benjamin-cash and @uturuncoglu Just confirming this has been tested on Frontera and works as expected with the tests you are able to run?
If so, @jkbk2004 can make sure to get this combined with another PR as we don't need to worry about baselines (as far as I understand).
Hi @BrianCurtis-NOAA - The changes to detect_machine.sh are something I've used multiple times when I've downloaded the weather model, so they should be good to go. The module changes have been somewhat overtaken by events - we now have spack-stack working via container on Frontera.
@benjamin-cash @BrianCurtis-NOAA I am testing on my end too. I'll update you soon.
@benjamin-cash I think there is an issue with the ufs_frontera.intel.lua file. There are some HTML tags in it, so it is probably corrupted.
Hi @uturuncoglu - Yikes, yeah, no idea how I managed to do that. Could you point me to the module file you have tested on Frontera and I will replace?
@benjamin-cash We are using the following with UFS Coastal - https://github.com/oceanmodeling/ufs-weather-model/blob/feature/coastal_app/modulefiles/ufs_frontera.intel.lua but I think you need to change the paths for your installation.
I have not tried to fix yours yet, but I could try if you want.
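A quick way to catch the kind of corruption described above (stray HTML in a modulefile, for example from saving a web page instead of the raw file) is a grep for common HTML tags. The helper name `has_html_tags` and the tag list are assumptions chosen for illustration; the list is not exhaustive.

```shell
#!/usr/bin/env bash
# Sketch: report HTML-like markup in a file that should be plain Lua.
# has_html_tags is a hypothetical helper; the tag list is not exhaustive.
# Exit status 0 means markup was found (matching lines are printed).
has_html_tags() {
  grep -nEi '</?(html|head|body|div|span|script|meta|link)[ />]|<!DOCTYPE' "$1"
}
```

For example, `has_html_tags modulefiles/ufs_frontera.intel.lua && echo "file looks corrupted"` would flag the file before it is committed.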
@benjamin-cash BTW, it seems that you don't have any changes on the rt.sh side. Are you planning to make them? UFS Coastal could still maintain its own changes related to rt.sh.
When this PR is ready, we can combine it with #2335 and #2278.
Hi @uturuncoglu - for this PR the module file was meant to be an exact copy of yours and to use the non-container version of spack-stack on Frontera. I hadn't made any changes to rt.sh in this PR, but maybe it would make sense to fold them in as well.