[Question]: Examples with FOM or similar?
What do you want?
Hi! Do you have any examples or tutorials that could be run for a strong or weak scaling study and have some kind of FOM (even if just running time)? We are doing a study on 4 to 64 nodes and looking for apps / proxy apps / benchmarks / synthetic benchmarks that could be candidates. The only requirement is that I can build it into a container and run it across nodes. Thanks!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
We are actively working on efficient samurai parallelization and hope to have working examples in June. You can run cases with MPI, but the performance will probably not be there when you enable mesh adaptation. The biggest challenge with these methods is the load balancing between subdomains, which has to happen regularly because of the dynamic adaptation. We are working on different methods for efficient load balancing: space-filling curves or a diffusion algorithm.
You can try the advection2d.cpp case in the demos/FiniteVolume directory without mesh refinement (min_level == max_level). We use CLI11 to manage the various command-line options; you can print them by running the executable with -h. When we have finished the implementation of the load balancing, we can repeat the experiment with mesh adaptation. What do you think?
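For example, assuming the demo builds to an executable named finite-volume-advection-2d (the target name that comes up later in this thread), the options can be printed like this:

```bash
# Print the CLI11 options of the advection demo; the executable name and its
# location in the build tree are assumptions to adjust for your build.
./finite-volume-advection-2d -h
```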
In any case, we're interested in your approach and setting up the procedure could be useful once everything is ready.
I've seen that other projects have been contacted. Could you tell us a little more about the purpose of this study?
> I've seen that other projects have been contacted. Could you tell us a little more about the purpose of this study?
Sure! We are running a study on Google Cloud H3 instances - I don't want to call it a performance study because it's far from a classical one - where we run apps at sizes from 4 up to 64, 128, or 256 nodes (depending on the scaling results) for strong or weak scaling. What we want to highlight in this work is the deployment and portability of the apps: each is deployed with Flux (an HPC job scheduler, and the system scheduler on El Capitan) via Helm charts, and you can see the set here:
https://github.com/converged-computing/flux-apps-helm/
So far I've done just under 30, and we are going to run out of credits at the end of the month, so I'm strategically trying to put together batches to run (for each app I need to understand it, containerize it, test it, and then deploy it at scale). I created and manage the RSEPedia, so I'm searching in there to find apps (which is how I found you).
Right now I'm on a small road trip, but I will pick up work Thursday evening!
I'm having a hard time building. I first tried installing dependencies directly with the system package manager, but hit the problem that my gcc didn't support C++17. The conan install didn't work either. So I tried setting up the development environment with mamba (and got farther there):
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh && \
chmod +x Miniforge3-Linux-x86_64.sh && \
bash Miniforge3-Linux-x86_64.sh -p /opt/miniconda -b
ENV PATH=/opt/miniconda/bin:$PATH
RUN mamba install -y samurai && \
mamba install -y cxx-compiler cmake make && \
mamba install -y petsc pkg-config && \
mamba install -y libboost-mpi libboost-devel libboost-headers 'hdf5=*=mpi*'
RUN git clone https://github.com/hpc-maths/samurai /opt/samurai && \
cd /opt/samurai && \
cmake . -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_DEMOS=ON && \
cmake --build ./build --config Release
But then the last cmake command (the build step) fails with:
/opt/samurai/include/samurai/io/hdf5.hpp:765:45: error: no matching function for call to 'HighFive::Selection::write_raw(samurai::ScalarField<samurai::MRMesh<samurai::MRConfig<2> >, long unsigned int>::value_type*&, HighFive::AtomicType<long unsigned int>, HighFive::PropertyList<HighFive::PropertyType::DATASET_XFER>&)'
765 | data_slice.write_raw(data_ptr, HighFive::AtomicType<typename Field::value_type>{}, xfer_props);
Do you have a Dockerfile already working, or can you make a suggestion? Ideally I could build this alongside system software (and not need an isolated environment). Thanks!
In the conda directory, we have the MPI environment needed for samurai, but it seems that you already made good choices for the package installation. You also have to activate MPI support, which is not the default behavior. To do that, add the following option to the cmake command line: -DWITH_MPI=ON.
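A minimal sketch of the configure and build steps with MPI enabled (the Ninja generator and the source/build layout are assumptions; the CI workflow linked below exercises this MPI build):

```bash
# Hedged sketch: configure samurai with MPI support and the demos enabled,
# then build. Run from the samurai source directory.
cmake . -B build -GNinja \
    -DCMAKE_BUILD_TYPE=Release \
    -DWITH_MPI=ON \
    -DBUILD_DEMOS=ON
cmake --build build --config Release
```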
You can take a look at the CI where the MPI support is tested : https://github.com/hpc-maths/samurai/blob/master/.github/workflows/ci.yml#L218-L306
We also have a Spack package for samurai: https://packages.spack.io/package.html?name=samurai This is not the latest version, but we will update it today. You can get a Dockerfile for free with Spack, as explained here: https://spack.readthedocs.io/en/latest/containers.html
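For instance, a hedged sketch of that Spack route (the spack.yaml contents follow the containers documentation linked above; the bare samurai spec and the image name are assumptions):

```bash
# Hedged sketch: let Spack generate a Dockerfile from an environment that
# contains samurai, then build the image as usual.
mkdir samurai-spack && cd samurai-spack
cat > spack.yaml <<'EOF'
spack:
  specs:
    - samurai
  container:
    format: docker
EOF
spack containerize > Dockerfile
docker build -t samurai-spack .
```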
Hope this helps!
That built the software OK, but compiling all the examples failed.
Scrolling up to the top of my terminal, I don't see any red, so it's not clear what failed. Can you point me to a specific build command (akin to one in the CI yaml you shared) that would be the best fit for the scaling study? Likely I can build one of the demos. I know you mentioned:
> You can try the advection2d.cpp case in the demos/FiniteVolume directory without mesh refinement
Can you show me how to build and run that? Apologies for having to ask - I'm not a pro with cmake!
Here is the full updated Dockerfile. The base image has our HPC workload manager, Flux.
ARG base=ghcr.io/converged-computing/flux-openmpi:ubuntu2204
FROM ${base}
WORKDIR /opt
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh && \
chmod +x Miniforge3-Linux-x86_64.sh && \
bash Miniforge3-Linux-x86_64.sh -p /opt/miniconda -b
ENV PATH=/opt/miniconda/bin:$PATH
COPY ./mpi-environment.yaml ./mpi-environment.yaml
RUN mamba env create --file ./mpi-environment.yaml && \
mamba shell init --shell bash && \
. ~/.bashrc && \
mamba activate samurai-env && \
mamba install -y mpich petsc pkg-config cxx-compiler
RUN git clone https://github.com/hpc-maths/samurai /opt/samurai && \
cd /opt/samurai && \
cmake . -Bbuild -GNinja \
-DCMAKE_BUILD_TYPE=Release \
-DWITH_MPI=ON \
-DBUILD_DEMOS=ON \
-DBUILD_TESTS=ON && \
cmake --build ./build --config Release
The last command above is what failed.
I realize that we're only compiling two examples with MPI:
https://github.com/hpc-maths/samurai/blob/master/.github/workflows/ci.yml#L272-L273
So I suggest starting with
cmake --build build --target finite-volume-advection-2d
and confirm that it works. I know that other examples work, since we work on their load balancing. But since we haven't tried to compile all of samurai's executables, we may have some issues. Sorry about that. I will have a look and fix it.
When I build that example, I get this error again:
In the Dockerfile above, everything is the same except that the last line is changed to the one you provided. Do I have an issue with a dependency version or something similar? It seems this call has the wrong number of arguments:
/opt/miniconda/envs/samurai-env/include/highfive/bits/H5Slice_traits.hpp:122:10: note: template argument deduction/substitution failed:
/opt/samurai/include/samurai/io/hdf5.hpp:765:45: note: candidate expects 2 arguments, 3 provided
765 | data_slice.write_raw(data_ptr, HighFive::AtomicType<typename Field::value_type>{}, xfer_props);
Could you run the command mamba list after activating the environment?
I will check the versions to see if anything is wrong.
Good idea!
(samurai-env) root@2f26b57dd5c7:/opt/samurai# mamba list
List of packages in environment: "/opt/miniconda/envs/samurai-env"
Name Version Build Channel
────────────────────────────────────────────────────────────────────────────────
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 3_kmp_llvm conda-forge
_x86_64-microarch-level 4 2_x86_64_v4 conda-forge
attr 2.5.1 h166bdaf_1 conda-forge
binutils 2.43 h4852527_4 conda-forge
binutils_impl_linux-64 2.43 h4bf12b8_4 conda-forge
binutils_linux-64 2.43 h4852527_4 conda-forge
bzip2 1.0.8 h4bc722e_7 conda-forge
c-ares 1.34.5 hb9d3cd8_0 conda-forge
c-compiler 1.9.0 h2b85faf_0 conda-forge
ca-certificates 2025.4.26 hbd8a1cb_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cli11 2.4.2 h5888daf_0 conda-forge
cmake 4.0.2 h74e3db0_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_1 conda-forge
cxx-compiler 1.9.0 h1a2810e_0 conda-forge
cxxopts 3.2.1 h74c10a1_1 conda-forge
exceptiongroup 1.2.2 pyhd8ed1ab_1 conda-forge
fftw 3.3.10 mpi_mpich_hbcf76dd_10 conda-forge
fmt 11.1.4 h07f6e7f_1 conda-forge
gcc 13.3.0 h9576a4e_2 conda-forge
gcc_impl_linux-64 13.3.0 h1e990d8_2 conda-forge
gcc_linux-64 13.3.0 hc28eda2_10 conda-forge
gxx 13.3.0 h9576a4e_2 conda-forge
gxx_impl_linux-64 13.3.0 hae580e1_2 conda-forge
gxx_linux-64 13.3.0 h6834431_10 conda-forge
h5py 3.13.0 nompi_py313hfaf8fd4_101 conda-forge
hdf5 1.14.6 mpi_mpich_h7f58efa_1 conda-forge
highfive 2.3.1 h4bd325d_0 conda-forge
hypre 2.32.0 mpi_mpich_h2e71eac_1 conda-forge
icu 75.1 he02047a_0 conda-forge
iniconfig 2.0.0 pyhd8ed1ab_1 conda-forge
kernel-headers_linux-64 3.10.0 he073ed8_18 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.21.3 h659f571_0 conda-forge
ld_impl_linux-64 2.43 h712a8e2_4 conda-forge
libaec 1.1.3 h59595ed_0 conda-forge
libamd 3.3.3 haaf9dc3_7100102 conda-forge
libblas 3.9.0 31_h59b9bed_openblas conda-forge
libboost 1.85.0 h0ccab89_4 conda-forge
libboost-devel 1.85.0 h00ab1b0_4 conda-forge
libboost-headers 1.85.0 ha770c72_4 conda-forge
libboost-mpi 1.85.0 h750f1fb_3 conda-forge
libbtf 2.3.2 h32481e8_7100102 conda-forge
libcamd 3.3.3 h32481e8_7100102 conda-forge
libcap 2.75 h39aace5_0 conda-forge
libcblas 3.9.0 31_he106b2a_openblas conda-forge
libccolamd 3.3.4 h32481e8_7100102 conda-forge
libcholmod 5.3.1 h59ddab4_7100102 conda-forge
libcolamd 3.3.4 h32481e8_7100102 conda-forge
libcurl 8.13.0 h332b0f4_0 conda-forge
libedit 3.1.20250104 pl5321h7949ede_0 conda-forge
libev 4.33 hd590300_2 conda-forge
libevent 2.1.12 hf998b51_1 conda-forge
libexpat 2.7.0 h5888daf_0 conda-forge
libfabric 2.1.0 ha770c72_1 conda-forge
libfabric1 2.1.0 hf45584d_1 conda-forge
libffi 3.4.6 h2dba641_1 conda-forge
libgcc 15.1.0 h767d61c_2 conda-forge
libgcc-devel_linux-64 13.3.0 hc03c837_102 conda-forge
libgcc-ng 15.1.0 h69a702a_2 conda-forge
libgcrypt-lib 1.11.0 hb9d3cd8_2 conda-forge
libgfortran 15.1.0 h69a702a_2 conda-forge
libgfortran-ng 15.1.0 h69a702a_2 conda-forge
libgfortran5 15.1.0 hcea5267_2 conda-forge
libgomp 15.1.0 h767d61c_2 conda-forge
libgpg-error 1.55 h3f2d84a_0 conda-forge
libhwloc 2.11.2 default_h0d58e46_1001 conda-forge
libiconv 1.18 h4ce23a2_1 conda-forge
libklu 2.3.5 hf24d653_7100102 conda-forge
liblapack 3.9.0 31_h7ac8fdf_openblas conda-forge
liblzma 5.8.1 hb9d3cd8_1 conda-forge
liblzma-devel 5.8.1 hb9d3cd8_1 conda-forge
libmpdec 4.0.0 h4bc722e_0 conda-forge
libnghttp2 1.64.0 h161d5f1_0 conda-forge
libnl 3.11.0 hb9d3cd8_0 conda-forge
libopenblas 0.3.29 openmp_hd680484_0 conda-forge
libpmix 5.0.7 h658e747_0 conda-forge
libptscotch 7.0.6 h4c3caac_1 conda-forge
libsanitizer 13.3.0 he8ea267_2 conda-forge
libscotch 7.0.6 hea33c07_1 conda-forge
libspqr 4.3.4 h852d39f_7100102 conda-forge
libsqlite 3.49.2 hee588c1_0 conda-forge
libssh2 1.11.1 hcf80075_0 conda-forge
libstdcxx 15.1.0 h8f9b012_2 conda-forge
libstdcxx-devel_linux-64 13.3.0 hc03c837_102 conda-forge
libstdcxx-ng 15.1.0 h4852527_2 conda-forge
libsuitesparseconfig 7.10.1 h92d6892_7100102 conda-forge
libsystemd0 257.4 h4e0b6ca_1 conda-forge
libudev1 257.4 hbe16f8c_1 conda-forge
libumfpack 6.3.5 heb53515_7100102 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libuv 1.50.0 hb9d3cd8_0 conda-forge
libxml2 2.13.8 h4bc477f_0 conda-forge
libzlib 1.3.1 hb9d3cd8_2 conda-forge
llvm-openmp 20.1.4 h024ca30_0 conda-forge
lz4-c 1.10.0 h5888daf_1 conda-forge
metis 5.1.0 hd0bcaf9_1007 conda-forge
mpi 1.0.1 mpich conda-forge
mpich 4.3.0 h1a8bee6_100 conda-forge
mumps-include 5.7.3 h23d43cc_10 conda-forge
mumps-mpi 5.7.3 h8c07e11_10 conda-forge
ncurses 6.5 h2d0b736_3 conda-forge
ninja 1.12.1 hff21bea_1 conda-forge
numpy 2.2.5 py313h17eae1a_0 conda-forge
openssl 3.5.0 h7b32b05_1 conda-forge
packaging 25.0 pyh29332c3_1 conda-forge
parmetis 4.0.3 hc7bef4e_1007 conda-forge
petsc 3.23.1 real_hf9cfe27_0 conda-forge
pip 25.1.1 pyh145f28c_0 conda-forge
pkg-config 0.29.2 h4bc722e_1009 conda-forge
pluggy 1.5.0 pyhd8ed1ab_1 conda-forge
pugixml 1.15 h3f63f65_0 conda-forge
pytest 8.3.5 pyhd8ed1ab_0 conda-forge
python 3.13.3 hf636f53_101_cp313 conda-forge
python_abi 3.13 7_cp313 conda-forge
rdma-core 57.0 h5888daf_0 conda-forge
readline 8.2 h8c095d6_2 conda-forge
rhash 1.4.5 hb9d3cd8_0 conda-forge
scalapack 2.2.0 h7e29ba8_4 conda-forge
superlu 7.0.1 h8f6e6c4_0 conda-forge
superlu_dist 9.1.0 h0804ebd_0 conda-forge
sysroot_linux-64 2.17 h0157908_18 conda-forge
tk 8.6.13 noxft_h4845f30_101 conda-forge
tomli 2.2.1 pyhd8ed1ab_1 conda-forge
tzdata 2025b h78e105d_0 conda-forge
ucc 1.3.0 had72a48_5 conda-forge
ucx 1.18.0 h1369271_4 conda-forge
xtensor 0.25.0 h00ab1b0_0 conda-forge
xtl 0.7.7 h00ab1b0_0 conda-forge
xz 5.8.1 hbcc6ac9_1 conda-forge
xz-gpl-tools 5.8.1 hbcc6ac9_1 conda-forge
xz-tools 5.8.1 hb9d3cd8_1 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
zstd 1.5.7 hb8e6e7a_2 conda-forge
The highfive version is too old; the correct version is 2.10. I don't know how this could have happened! The versions of the other dependencies look good.
I tried updating, but it seems that somehow caused hdf5 to switch to a non-parallel build?
That is the initial cmake command (which was working before). I'm not sure what to try next.
I tried, and here is what works for me:
mpi-environment.yaml
name: samurai-env
channels:
- conda-forge
dependencies:
- cmake
- ninja
- xtensor<0.26
- highfive>=2.10
- fmt
- pugixml
- cxxopts
- cli11<2.5
- pytest
- h5py
- openmpi
- libboost-mpi<1.87 # MPI seems broken with 1.87 with the error : symbol not found in flat namespace '_PyBaseObject_Type'
- libboost-devel
- libboost-headers
- hdf5=*=mpi*
Dockerfile
ARG base=ghcr.io/converged-computing/flux-openmpi:ubuntu2204
FROM ${base}
WORKDIR /opt
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh && \
chmod +x Miniforge3-Linux-x86_64.sh && \
bash Miniforge3-Linux-x86_64.sh -p /opt/miniconda -b
ENV PATH=/opt/miniconda/bin:$PATH
COPY ./mpi-environment.yaml ./mpi-environment.yaml
RUN mamba env create --file ./mpi-environment.yaml
SHELL ["conda", "run", "-n", "samurai-env", "/bin/bash", "-c"]
RUN mamba install -y cxx-compiler
RUN git clone https://github.com/hpc-maths/samurai /opt/samurai && \
cd /opt/samurai && \
cmake . -Bbuild -GNinja \
-DCMAKE_BUILD_TYPE=Release \
-DWITH_MPI=ON \
-DBUILD_DEMOS=ON && \
cmake --build ./build --config Release --target finite-volume-advection-2d
Hope this helps!
Thank you! I should be able to test again this week! I'm epically flailing with eBPF in containers at the moment. 😆
OK - the container is built and the finite volume example is working in a test environment!
Here are the options I see:
How should I run this, starting at 4 nodes and going up to (likely) 64? What kind of scaling, which parameters should I set (and how do I change them, if needed, aside from the nodes and tasks), and what is the FOM?
I'm off to bed but can run these tomorrow.
Nice!
The first thing you can do is try running the example without the adaptive mesh. To do this, set min_level = max_level. If you are doing a strong scaling measurement, you can start with min_level = max_level = 14.
For weak scaling, you need to increase the level with the number of subdomains. The domain is divided along the y-direction, and the number of cells in each direction is $2^{level}$.
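To make that rule concrete, here is a hedged weak-scaling sketch (the executable name, the 88-tasks-per-node figure, and the --min-level/--max-level flags mirror the strong-scaling commands further down in this thread; the exact counts are assumptions for illustration). Since the grid has $2^{level} \times 2^{level} = 4^{level}$ cells, each +1 in level is roughly 4x the work, so the node count is quadrupled per level increment:

```bash
# Hedged weak-scaling sketch: total cells = 2^level x 2^level = 4^level,
# so quadruple the node count for each +1 in level to keep work per node
# roughly constant (88 tasks per node assumed).
flux run -N4  -n352  ./finite-volume-advection-2d --min-level=14 --max-level=14
flux run -N16 -n1408 ./finite-volume-advection-2d --min-level=15 --max-level=15
flux run -N64 -n5632 ./finite-volume-advection-2d --min-level=16 --max-level=16
```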
Do you have an example with strong scaling? Or do I just use min and max == 14 and give more resources?
I would basically do:
flux run -N4 -n352 ./finite-volume-advection-2d --min-level=14 --max-level=14
flux run -N8 -n704 ./finite-volume-advection-2d --min-level=14 --max-level=14
...
Up to the largest size (likely 64). Is the FOM just the duration?
Yes, I think that is enough. But you should probably remove the I/O, because otherwise you will write a big file and most of the time will be spent on that task.
To remove the I/O, you just have to comment out the lines where the save function is called in the main function.
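If it helps, here is a hedged sketch of capturing the wall-clock duration as the FOM once the save calls are commented out (the 88-tasks-per-node figure and the executable path carry over from the commands above and are assumptions to adjust for your setup):

```bash
# Hedged sketch: use the wall-clock time of each run as the FOM, with I/O
# disabled by commenting out the save(...) calls in the demo before rebuilding.
for nodes in 4 8 16 32 64; do
    tasks=$((nodes * 88))   # 88 tasks per node, as in the commands above
    echo "=== strong scaling: N=${nodes} n=${tasks} ==="
    time flux run -N${nodes} -n${tasks} ./finite-volume-advection-2d \
        --min-level=14 --max-level=14
done
```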