MFEM-MOOSE Intro doc
Closes #30635
@alexanderianblair See what you think and feel free to change/remove/expand anything!
I wonder, can we not use the !listing syntax to avoid explicitly copying the different sections of diffusion.i here?
I didn't realize you'd be working on something like this, so in #30637 I was adding some related docs. It might make sense to link them together.
Yep, I think that's a good idea, @nmnobre! I'll replace the copied blocks with listings.
Thanks for adding this guide -- it's helpful.
I tried to follow these instructions and took some notes along the way:
- conduit and mfem build scripts don't support `--fast`, which is kind of a moose standard for skipping the cmake configure step
- I need to install a few python packages to get conduit documentation (do we need that?) to build: `pip install sphinx sphinxcontrib-jquery sphinx-rtd-theme`
- The documentation says `--with-mpi` might be needed, but I am not sure where exactly it's needed. I didn't use that flag and things seem to work fine.
- The documentation says "If `MFEM-MOOSE` has been built with GPU offloading capabilities, here it is possible to set `/Executioner/device` to `cuda` or `hip` to make use of GPU acceleration.", but I (or user) don't know if mfem has been built with cuda or hip...
- I could run the diffusion problem with and without mpi on cpu after following these instructions, which is cool. But when I change the device to cuda, I get an MPI_Abort without any useful error message.
I think overall this documentation is pretty easy to follow!
Even though I closed my seemingly duplicate issue, I think I'd still like to see what I suggested there: https://github.com/idaholab/moose/issues/30697 namely an "install mfem" section under "optional packages".
Right now in this PR, there is a single markdown page. I think it's trying to do two things at the same time:
- help people install mfem
- teach people at a high level how mfem works in moose
I think the second part could be separated out as another page. The install page can link to that page in the end.
Thank you for the feedback @hugary1995, this is very useful!
I agree that the tutorial and the install should be different pages. I'll split them up and write more detailed install instructions as soon as I can, and I'll try to address the other points you've made regarding the install options.
Yeah, if you've run the build scripts without any additional flags, then that will result in a CPU build, which is why the run is failing if you try to set the device as cuda. When I write the install page in more detail, I'll go over how to build for GPUs. The MPI_Abort error you're getting is what happens when MFEM hits an mfem_error(). Unfortunately, unless MFEM has been built in debug mode, the mfem_error() doesn't display the error message that accompanies it, which I agree is unhelpful. The best way to solve this is building MFEM in debug mode, but if you don't want to go through this hassle, one way of checking where the error is coming from is running the code that's failing in gdb and adding a breakpoint at the mfem_error function. Then, the backtrace will tell you where the error happened and why.
Still, in this case, there's no need to check because I'm pretty sure your error is happening due to trying to do a cuda run on a CPU build.
Sounds good. Yeah, some more explanation on different build options would be very helpful.
I understand how to set a breakpoint and get a backtrace, etc. The point I was trying to make is that end users will get a non-debug build most of the time, so an error message rather than a hard termination would always be helpful. Perhaps this is something for future improvement. One easy way of fixing this would be to check Executioner/device at construction or setup time, and issue a paramError if mfem isn't built with the correct flags.
Indeed. Unfortunately, we can't get rid of all possible instances of this MPI_Abort because the mfem_error often comes from underlying MFEM methods. We do try as much as possible to replace MFEM error checks with Moose ones, since Moose errors do come with a message attached. I'll see if I can add a Moose check in our setDevice() method so that it throws a message if you try to do a gpu run on a cpu build.
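For reference, the kind of check discussed here could look roughly like the sketch below. It is only an illustration and not the actual setDevice() code: `deviceError` is a hypothetical helper, and the sketch assumes that MFEM's configuration header defines `MFEM_USE_CUDA` / `MFEM_USE_HIP` when the library was configured for those backends. Because the snippet does not include MFEM, both macros are undefined here and it behaves like a CPU-only build.

```cpp
#include <string>

// Assumption: these macros come from MFEM's generated config header when MFEM
// is built with the corresponding backend. This standalone sketch does not
// include MFEM, so both flags default to false (CPU-only build).
#ifdef MFEM_USE_CUDA
constexpr bool kBuiltWithCuda = true;
#else
constexpr bool kBuiltWithCuda = false;
#endif

#ifdef MFEM_USE_HIP
constexpr bool kBuiltWithHip = true;
#else
constexpr bool kBuiltWithHip = false;
#endif

// Hypothetical helper: returns an empty string if the requested device is
// compatible with the build flags, otherwise a human-readable message that a
// caller could pass on to something like paramError.
std::string deviceError(const std::string & device,
                        bool built_with_cuda = kBuiltWithCuda,
                        bool built_with_hip = kBuiltWithHip)
{
  if (device == "cuda" && !built_with_cuda)
    return "Executioner/device is 'cuda', but MFEM was not built with CUDA support.";
  if (device == "hip" && !built_with_hip)
    return "Executioner/device is 'hip', but MFEM was not built with HIP support.";
  return "";
}
```

Calling `deviceError("cuda")` on a CPU-only build returns a non-empty message, so the run can stop with an explanation at setup time instead of ending in a bare MPI_Abort later on.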
@hugary1995 I've split the document into an MFEM-MOOSE tutorial and an install guide. The install guide now offers a little bit more detail, including how to build MFEM-MOOSE for GPUs.
I have also opened a PR (#30732) to fix the lack of a --fast option on the mfem build script.
Let me know if you encounter any issues!
hey how do I navigate to those pages? should there be a link there https://mooseframework.inl.gov/docs/PRs/30636/site/getting_started/examples_and_tutorials/index.html or there https://mooseframework.inl.gov/docs/PRs/30636/site/getting_started/installation/index.html
Yeah, see my comment above -- I made a suggestion in https://github.com/idaholab/moose/issues/30697. MFEM should be one of the optional packages.
I think now that you are adding an install_mfem.md page, the mfem_warning.md can just point to that page instead of maintaining a separate copy of the installation instructions.
That's a very good point, let me do this now