dolfinx icon indicating copy to clipboard operation
dolfinx copied to clipboard

Mesh reader/writer with ADIOS2

Open jorgensd opened this issue 2 years ago • 7 comments

This is the first step towards having function check-pointing in DOLFINx.

The code uses ADIOS2 as backend for reading and writing data. No MPI communication required in either stage (until mesh creation/partitioning).

The following C++ code runs nicely


#include <dolfinx/common/MPI.h>
#include <dolfinx/io/ADIOS2Writers.h>
#include <dolfinx/mesh/Mesh.h>
#include <dolfinx/mesh/generation.h>
#include <filesystem>
#include <mpi.h>

using namespace dolfinx;

int main(int argc, char* argv[])
{
  dolfinx::init_logging(argc, argv);
  MPI_Init(&argc, &argv);

  {
    auto mesh
        = std::make_shared<mesh::Mesh<float>>(mesh::create_rectangle<float>(
            MPI_COMM_WORLD, {{{0.0, 0.0}, {1.0, 1.0}}}, {2, 3},
            mesh::CellType::triangle,
            mesh::create_cell_partitioner(mesh::GhostMode::none)));

#ifdef HAS_ADIOS2
    std::filesystem::__cxx11::path filename = "test";
    io::Checkpointer<float> outfile(mesh->comm(),
                                    filename.replace_extension("bp"), mesh);
    outfile.write(2.0);
    outfile.close();
    io::Checkpointer<float> infile(mesh->comm(),
                                   filename.replace_extension("bp"));
    mesh::Mesh<float> mesh_no_ghost
        = infile.read_mesh(2.0, mesh::GhostMode::none);
    mesh::Mesh<float> mesh_shared
        = infile.read_mesh(2.0, mesh::GhostMode::shared_facet);

    std::stringstream cc;
    const int rank = dolfinx::MPI::rank(MPI_COMM_WORLD);
    cc << "-" << rank << "-"
       << "\n";
    cc << "Mesh no ghosts: "
       << mesh_no_ghost.topology()
              ->index_map(mesh_no_ghost.topology()->dim())
              ->size_local()
       << " "
       << mesh_no_ghost.topology()
              ->index_map(mesh_no_ghost.topology()->dim())
              ->num_ghosts()
       << "\n";
    cc << "Mesh shared_facet: "
       << mesh_shared.topology()
              ->index_map(mesh_shared.topology()->dim())
              ->size_local()
       << " "
       << mesh_shared.topology()
              ->index_map(mesh_shared.topology()->dim())
              ->num_ghosts()
       << "\n";
    infile.close();

    io::Checkpointer<float> infile_self(MPI_COMM_SELF,
                                        filename.replace_extension("bp"));
    mesh::Mesh<float> mesh_self
        = infile_self.read_mesh(2.0, mesh::GhostMode::none);
    infile_self.close();
    cc << "MEsh self: "
       << mesh_self.topology()
              ->index_map(mesh_self.topology()->dim())
              ->size_local()
       << " "
       << mesh_self.topology()
              ->index_map(mesh_self.topology()->dim())
              ->num_ghosts()
       << "\n";
    std::cout << cc.str();

    io::VTXWriter<float> vtx_ng(
        mesh_no_ghost.comm(), "mesh_no_ghost.bp",
        std::make_shared<mesh::Mesh<float>>(mesh_no_ghost));
    vtx_ng.write(0.0);
    vtx_ng.close();

    io::VTXWriter<float> vtx_s(
        mesh_no_ghost.comm(), "mesh_shared.bp",
        std::make_shared<mesh::Mesh<float>>(mesh_shared));
    vtx_s.write(0.0);
    vtx_s.close();

    io::VTXWriter<float> vtx_self(
        mesh_self.comm(), "mesh" + std::to_string(rank) + ".bp",
        std::make_shared<mesh::Mesh<float>>(mesh_self));
    vtx_self.write(0.0);
    vtx_self.close();

#endif
  }
  MPI_Finalize();

  return 0;
}

 mpirun -n 3 ./build-dir-real/demo_interpolation-io 
-0-
Mesh no ghosts: 4 0
Mesh shared_facet: 4 2
MEsh self: 12 0
-1-
Mesh no ghosts: 4 0
Mesh shared_facet: 4 2
MEsh self: 12 0
-2-
Mesh no ghosts: 4 0
Mesh shared_facet: 4 4
MEsh self: 12 0

It would be good to get feedback on this design prior to implementing the python interface.

Once this is merged, I can start working on the function checkpointing, with the approach used in https://github.com/jorgensd/adios4dolfinx/blob/main/src/adios4dolfinx/checkpointing.py#L91-L173

jorgensd avatar Apr 11 '23 14:04 jorgensd

Some preliminary comments.

Could this be divided into a 'read' PR, and a checkpoint PR after?

No, as this reads the mesh from the mesh format this PR creates a writer for

jorgensd avatar Apr 27 '23 07:04 jorgensd

Some preliminary comments.

Could this be divided into a 'read' PR, and a checkpoint PR after?

Changes have been made according to your suggestions, any further comments?

jorgensd avatar May 02 '23 08:05 jorgensd

@garth-wells Any further comments?

jorgensd avatar May 16 '23 14:05 jorgensd

@jorgensd could you make clearer in the PR comment what this changes does in practice?

garth-wells avatar Jun 25 '23 11:06 garth-wells

Another round of comments.

Would be helpful to have clear statement in the PR of what functionality is being add.

The code can write meshes to file on N processes, and read it back in on M processes. This is the first step towards having full checkpointing, as a next PR would add in writing/reading functions.

This is the reason for making the checkpointer a class, as one would write the mesh and function to the same file (and keep it alive during the process), avoiding duplicate initialization statements.

What differs in this code from the already existing Fides/VTXWriter is that:

  1. the data is written to global arrays (so that they can be read in again on a different number of processes).
  2. the topology is stored in DOLFINx order, i.e. the dofs are not permuted to VTK or GMSH ordering.

jorgensd avatar Jun 25 '23 12:06 jorgensd

What differs in this code from the already existing Fides/VTXWriter is that:

  1. the data is written to global arrays (so that they can be read in again on a different number of processes).

What's a 'global array'? An ADIOS2 concept?

The VTX files we write do have a local-to-global map array.

  1. the topology is stored in DOLFINx order, i.e. the dofs are not permuted to VTK or GMSH ordering.

Do we want to guarantee that the DOLFINx remains unchanged? An advantage of VTK ordering is that we can freely change the internal DOLFINx ordering.

garth-wells avatar Jul 06 '23 09:07 garth-wells

What's a 'global array'? An ADIOS2 concept?

Yes. Consider writing out a mesh with the VTXWriter in serial:

root@dokken-XPS-9320:~/shared/dolfinx/cpp/demo/poisson# ./build-dir-real64/demo_poisson 
root@dokken-XPS-9320:~/shared/dolfinx/cpp/demo/poisson# bpls -l mesh.bp/
  uint32_t  NumberOfCells        {1} = 1024 / 1024
  uint32_t  NumberOfNodes        {1} = 561 / 561
  int64_t   connectivity         [1]*{1024, 4} = 0 / 560
  double    geometry             [1]*{561, 3} = 0 / 2
  double    step                 scalar = 0
  uint32_t  types                scalar = 69
  uint8_t   vtkGhostType         [1]*{561} = 0 / 0
  int64_t   vtkOriginalPointIds  [1]*{561} = 0 / 560

is what you get in serial, but in parallel you get

root@dokken-XPS-9320:~/shared/dolfinx/cpp/demo/poisson# mpirun -n 4 ./build-dir-real64/demo_poisson 
root@dokken-XPS-9320:~/shared/dolfinx/cpp/demo/poisson# bpls -l mesh.bp/
  uint32_t  NumberOfCells        {4} = 272 / 288
  uint32_t  NumberOfNodes        {4} = 166 / 182
  int64_t   connectivity         [4]*{__, 4} = 0 / 181
  double    geometry             [4]*{__, 3} = 0 / 2
  double    step                 scalar = 0
  uint32_t  types                scalar = 69

which means that the data is split into block structures (equal to the number of processors).

This makes the data very fast to write, but very hard to read on any other number of processes than the one that wrote it.

A global array is what we are creating when we specify both local and global dimensions in put as well as the global offset, like:

adios2::Variable<T> local_geometry = impl_adios2::define_variable<T>(
      io, "geometry", {num_xdofs_global, gdim}, {local_range[0], 0},
      {num_xdofs_local, gdim});

which gives you a .bp file that looks like:

root@dokken-XPS-9320:~/shared/dolfinx/cpp/demo/checkpoint# mpirun -n 4 ./build-dir-real64/demo_interpolation-io 
root@dokken-XPS-9320:~/shared/dolfinx/cpp/demo/checkpoint# bpls -l test.bp/
  uint32_t  CellPermutations  {12} = 0 / 6
  int64_t   connectivity      {12, 3} = 0 / 11
  float     geometry          {12, 2} = 0 / 1
  double    step              scalar = 2

jorgensd avatar Jul 06 '23 10:07 jorgensd