MFEM-MOOSE Intro doc

Open Heinrich-BR opened this issue 8 months ago • 11 comments

Closes #30635

Heinrich-BR avatar Jun 02 '25 15:06 Heinrich-BR

@alexanderianblair See what you think and feel free to change/remove/expand anything!

Heinrich-BR avatar Jun 02 '25 15:06 Heinrich-BR

I wonder, can we not use the !listing syntax to avoid explicitly copying the different sections of diffusion.i here? I didn't realize you'd be working on something like this, so I was adding some related docs in #30637; it might make sense to link them together.

nmnobre avatar Jun 02 '25 16:06 nmnobre

Job Documentation, step Docs: sync website on 97ec5da wanted to post the following:

View the site here

This comment will be updated on new commits.

moosebuild avatar Jun 02 '25 22:06 moosebuild

> I wonder, can we not use the !listing syntax to avoid explicitly copying the different sections of diffusion.i here? I didn't realize you'd be working on something like this, so I was adding some related docs in #30637; it might make sense to link them together.

Yep, I think that's a good idea, @nmnobre! I'll replace the copied blocks with listings.

Heinrich-BR avatar Jun 09 '25 08:06 Heinrich-BR

Thanks for adding this guide -- it's helpful.

I tried to follow these instructions and took some notes along the way:

  • The conduit and mfem build scripts don't support --fast, which is kind of a MOOSE standard for skipping the CMake configure step.
  • I needed to install a few Python packages to get the conduit documentation to build (do we need that?): pip install sphinx sphinxcontrib-jquery sphinx-rtd-theme
  • The documentation says --with-mpi might be needed, but I am not sure where exactly it's needed. I didn't use that flag and things seem to work fine.
  • The documentation says "If MFEM-MOOSE has been built with GPU offloading capabilities, here it is possible to set /Executioner/device to cuda or hip to make use of GPU acceleration.", but I (or a user) don't know whether mfem has been built with cuda or hip...
  • I could run the diffusion problem with and without MPI on CPU after following these instructions, which is cool. But when I change the device to cuda, I get an MPI_Abort without any useful error message.

I think overall this documentation is pretty easy to follow!

Even though I closed my seemingly duplicate issue, I think I'd still like to see what I suggested there: https://github.com/idaholab/moose/issues/30697, namely an "install mfem" section under "optional packages".

Right now in this PR, there is a single markdown page. I think it's trying to do two things at the same time:

  1. help people install mfem
  2. teach people at a high level how mfem works in moose

I think the second part could be separated out as another page. The install page can link to that page at the end.

hugary1995 avatar Jun 10 '25 20:06 hugary1995

Thank you for the feedback @hugary1995, this is very useful!

> Even though I closed my seemingly duplicate issue, I think I'd still like to see what I suggested there: #30697, namely an "install mfem" section under "optional packages".
>
> Right now in this PR, there is a single markdown page. I think it's trying to do two things at the same time:
>
>   1. help people install mfem
>   2. teach people at a high level how mfem works in moose
>
> I think the second part could be separated out as another page. The install page can link to that page at the end.

I agree that the tutorial and the install should be different pages. I'll split them up and write more detailed install instructions as soon as I can, and I'll try to address the other points you've made regarding the install options.

>   • The documentation says "If MFEM-MOOSE has been built with GPU offloading capabilities, here it is possible to set /Executioner/device to cuda or hip to make use of GPU acceleration.", but I (or a user) don't know whether mfem has been built with cuda or hip...
>   • I could run the diffusion problem with and without MPI on CPU after following these instructions, which is cool. But when I change the device to cuda, I get an MPI_Abort without any useful error message.

Yeah, if you've run the build scripts without any additional flags, that results in a CPU build, which is why the run fails when you set the device to cuda. When I write the install page in more detail, I'll go over how to build for GPUs. The MPI_Abort error you're getting is what happens when MFEM hits an mfem_error(). Unfortunately, unless MFEM has been built in debug mode, mfem_error() doesn't display the error message that accompanies it, which I agree is unhelpful. The best way to solve this is to build MFEM in debug mode, but if you don't want to go through this hassle, one way of checking where the error is coming from is to run the failing code in gdb and add a breakpoint at the mfem_error function. Then, the backtrace will tell you where the error happened and why.

Still, in this case, there's no need to check because I'm pretty sure your error is happening due to trying to do a cuda run on a CPU build.

Heinrich-BR avatar Jun 11 '25 11:06 Heinrich-BR

Sounds good. Yeah, some more explanation on different build options would be very helpful.

I understand how to set a breakpoint and get a backtrace, etc. The point I was trying to make is that end users will get a non-debug build most of the time, so an error message rather than a hard termination would always be helpful. Perhaps this is something for future improvement. One easy way of fixing this would be to check Executioner/device at construction or setup time, and issue a paramError if mfem isn't built with the correct flags.

hugary1995 avatar Jun 11 '25 12:06 hugary1995

Indeed. Unfortunately, we can't get rid of all possible instances of this MPI_Abort because the mfem_error often comes from underlying MFEM methods. We do try as much as possible to replace MFEM error checks with MOOSE ones, since MOOSE errors do come with a message attached. I'll see if I can add a MOOSE check in our setDevice() method so that it throws a message if you try to do a GPU run on a CPU build.
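
A minimal sketch of the kind of device check discussed above, not the actual MFEM-MOOSE implementation. The helper names availableDevices() and checkDevice() are hypothetical, and the sketch throws a standard exception where a MOOSE object would issue a paramError. It assumes the build's GPU support can be read from the MFEM_USE_CUDA and MFEM_USE_HIP macros that MFEM defines in its configuration header; because the sketch does not include MFEM, those macros are undefined here unless passed on the compiler command line.

```cpp
#include <set>
#include <stdexcept>
#include <string>

// Devices this build can run on. MFEM_USE_CUDA / MFEM_USE_HIP are normally
// defined by MFEM's config header for GPU builds; in this standalone sketch
// they are only defined if passed with -D (assumption for illustration).
inline std::set<std::string> availableDevices()
{
  std::set<std::string> devices = {"cpu"};
#ifdef MFEM_USE_CUDA
  devices.insert("cuda");
#endif
#ifdef MFEM_USE_HIP
  devices.insert("hip");
#endif
  return devices;
}

// Hypothetical helper: reject a requested device that the build cannot
// provide, with a readable message. A MOOSE object would call paramError
// here instead of throwing.
inline void checkDevice(const std::string & requested,
                        const std::set<std::string> & available)
{
  if (available.count(requested))
    return;

  // Build a comma-separated list of the devices that are supported.
  std::string list;
  for (const auto & d : available)
    list += (list.empty() ? "" : ", ") + d;

  throw std::invalid_argument("Executioner/device = '" + requested +
                              "' is not supported by this MFEM build (available: " +
                              list + ")");
}
```

Calling checkDevice(requested, availableDevices()) before configuring the device would turn a cuda request on a CPU-only build into an error that names the unsupported device, rather than a bare MPI_Abort.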

Heinrich-BR avatar Jun 11 '25 13:06 Heinrich-BR

@hugary1995 I've split the document into an MFEM-MOOSE tutorial and an install guide. The install guide now offers a little bit more detail, including how to build MFEM-MOOSE for GPUs. I have also opened a PR (#30732) to fix the lack of a --fast option on the mfem build script.

Let me know if you encounter any issues!

Heinrich-BR avatar Jun 12 '25 14:06 Heinrich-BR

Hey, how do I navigate to those pages? Should there be a link here: https://mooseframework.inl.gov/docs/PRs/30636/site/getting_started/examples_and_tutorials/index.html or here: https://mooseframework.inl.gov/docs/PRs/30636/site/getting_started/installation/index.html

GiudGiud avatar Jun 19 '25 15:06 GiudGiud

Yeah, see my comment above -- I made a suggestion in https://github.com/idaholab/moose/issues/30697. MFEM should be one of the optional packages.

hugary1995 avatar Jun 19 '25 15:06 hugary1995

I think now that you are adding an install_mfem.md page, the mfem_warning.md can just point to that page instead of maintaining a separate copy of the installation instructions.

hugary1995 avatar Jun 27 '25 11:06 hugary1995

> I think now that you are adding an install_mfem.md page, the mfem_warning.md can just point to that page instead of maintaining a separate copy of the installation instructions.

That's a very good point, let me do this now.

Heinrich-BR avatar Jun 27 '25 12:06 Heinrich-BR