GridapDistributed.jl Updating to PartitionedArrays v0.5

Open JordiManyer opened this issue 1 year ago • 0 comments

We have been detached from the development of PartitionedArrays for a while and are currently stuck at v0.3. This is also related to #137.

This PR updates GridapDistributed to PartitionedArrays v0.5, with a substantial rework of the re-assembly process... This comes with several upsides, such as:

- split format for distributed vectors and matrices
- re-partitioning of distributed data
- sub-assembled and dis-assembled matrices
- optimisations

Change Log

FESpace free dof ids are now given by a permuted instance of LocalIndicesWithVariableBlockSize. We were already doing this (so our numbering strategy does not change), but it's now explicit. This has two advantages:
- To find the index partitions for the assembled system, we just un-permute the indices, i.e. we get the underlying LocalIndicesWithVariableBlockSize.
- Finding owners for arbitrary gids now requires minimal communication (just a scan operation). This is quite handy for assembly, since expanding the ghost ids in the index partition can be done using PArrays' find_owner and union_ghost methods.
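To illustrate why a block-wise partition makes owner lookup cheap, here is a minimal sketch in plain Python (not the PartitionedArrays API). It assumes each rank owns one contiguous block of gids, uses 0-based ids, and the names `block_starts` and `owner_of_gid` are hypothetical. In a distributed run the scan would be the only collective step; after that, every lookup is local.

```python
from bisect import bisect_right
from itertools import accumulate


def block_starts(block_sizes):
    """Exclusive scan of the per-rank block sizes.

    Entry r is the first gid owned by rank r; the last entry is the
    total number of gids.
    """
    return [0] + list(accumulate(block_sizes))


def owner_of_gid(gid, starts):
    """Rank owning `gid`, found locally by binary search on the scan."""
    if not 0 <= gid < starts[-1]:
        raise IndexError(f"gid {gid} is outside the global range")
    return bisect_right(starts, gid) - 1


# Three ranks owning 4, 2 and 5 gids respectively (assumed sizes).
block_sizes = [4, 2, 5]
starts = block_starts(block_sizes)
owners = [owner_of_gid(g, starts) for g in (0, 3, 4, 5, 6, 10)]
```

With these sizes, gids 0 to 3 belong to rank 0, gids 4 and 5 to rank 1, and the rest to rank 2. Ranks with an empty block are skipped automatically by the search.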

Notes:

Assembly cache reuse strategy

The most involved part of this PR is the assembly cache re-use. The issue is that so far we had been using the output PSparseMatrix as the cache, by allocating extra ghost rows that were only used in the sub-assembly process. This has quite a lot of downsides (see #137), so we are pivoting towards an external assembly cache model.

However, having external caches does not go well with the current Gridap API. Here is a possible solution:

- assemble_matrix! will not be supported, except if an Assembler is provided (see next point).
- The Assembler will hold the caches. To be able to reuse an assembler for multiple matrices, we will use a Dict keyed by the matrix object-id to hold multiple caches (each tied to a single matrix). Cache re-use will be activated by a boolean variable reuse that is held by the Assembler.
- The FEOperators will keep an instance of their assembler, allowing re-use by default for nonlinear and transient FEOperators.
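The external cache model above could look roughly like the following sketch. It is written in plain Python as an illustration only: `ToyAssembler`, its methods and the cache contents are hypothetical stand-ins, not the Gridap or GridapDistributed API, and a plain list plays the role of the distributed matrix.

```python
class ToyAssembler:
    """Hypothetical assembler that owns the re-assembly caches."""

    def __init__(self, reuse=False):
        self.reuse = reuse
        self.caches = {}  # id(matrix) -> cache tied to that single matrix

    def assemble_matrix(self, values):
        matrix = list(values)  # stand-in for a distributed sparse matrix
        if self.reuse:
            # Stand-in for whatever the real cache would hold.
            self.caches[id(matrix)] = {"nnz": len(matrix), "reassemblies": 0}
        return matrix

    def assemble_matrix_inplace(self, matrix, values):
        cache = self.caches.get(id(matrix))
        if cache is None:
            raise KeyError("no cache for this matrix; was reuse enabled?")
        matrix[:] = values  # in-place update keeps the same object id
        cache["reassemblies"] += 1
        return matrix


assembler = ToyAssembler(reuse=True)
A = assembler.assemble_matrix([1.0, 2.0])
B = assembler.assemble_matrix([3.0])
assembler.assemble_matrix_inplace(A, [5.0, 6.0])
```

One caveat of this sketch: the dictionary does not keep the matrix alive, so in Python an id could be recycled after the matrix is garbage collected. A real implementation would need some policy for dropping stale entries.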

Tasks from #137

[ ] We will change the current assembly strategies so that the user can retrieve either an assembled or a sub-assembled PSparseMatrix.
[ ] In both cases, the re-assembly will be done by re-using the COO matrix (instead of directly inserting in the matrix). This will save binary searches, which can be quite costly. The re-assembly caches should be stored somewhere, possibly the FEOperator or Solver.
[ ] The IndexPartitions in the matrix should be block-wise, so that the owner of a dof can be locally deduced from its gid. This will save us some communication. I believe we are looking for the struct LocalIndicesWithVariableBlockSize.
[ ] For performance reasons, we might still want to reorder the dofs in the FESpace so that owned dofs come first, then ghosts. Although the memory accesses during integration are still quite random, this will perform better when we are (eventually) able to use FESpace-allocated vectors with the system matrix.
[ ] PartitionedArrays will relax the conditions on the ghost layouts for its matrix-vector product. See https://github.com/fverdugo/PartitionedArrays.jl/issues/127.
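As an illustration of the COO re-use task above, here is a minimal serial sketch in plain Python. It assumes a CSR target and 0-based indices; the function names are hypothetical and this is not how PartitionedArrays stores its data. The idea is to compute once, at first assembly, the slot of every COO entry in the compressed matrix. Re-assembly is then a plain scatter-add through that cached permutation, with no per-entry search.

```python
def build_csr_with_cache(nrows, rows, cols, vals):
    """First assembly: build CSR arrays and the COO -> CSR slot cache."""
    # Sorting the unique (row, col) pairs once fixes the sparsity pattern.
    pattern = sorted(set(zip(rows, cols)))
    slot = {rc: k for k, rc in enumerate(pattern)}
    rowptr = [0] * (nrows + 1)
    for r, _ in pattern:
        rowptr[r + 1] += 1
    for r in range(nrows):
        rowptr[r + 1] += rowptr[r]
    colval = [c for _, c in pattern]
    perm = [slot[rc] for rc in zip(rows, cols)]  # the re-assembly cache
    nzval = [0.0] * len(pattern)
    for k, v in zip(perm, vals):
        nzval[k] += v
    return rowptr, colval, nzval, perm


def reassemble(nzval, perm, new_vals):
    """Re-assembly: zero the stored values and scatter-add, no searches."""
    for k in range(len(nzval)):
        nzval[k] = 0.0
    for k, v in zip(perm, new_vals):
        nzval[k] += v


# Two 1D cells with element matrix [[1, -1], [-1, 1]] on dofs (0,1) and (1,2).
rows = [0, 0, 1, 1, 1, 1, 2, 2]
cols = [0, 1, 0, 1, 1, 2, 1, 2]
vals = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
rowptr, colval, nzval, perm = build_csr_with_cache(3, rows, cols, vals)
first = list(nzval)
reassemble(nzval, perm, [2.0 * v for v in vals])
```

The COO entries must be generated in the same order at every re-assembly for the cached permutation to stay valid, which holds when the same cells are visited in the same order.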

Assembly strategies

With PartitionedArrays v0.5, new assembly strategies are available. Also, we've been wanting to fix the confusing names we currently have. I foresee the following assembly strategies:

- Assembled: We integrate on owned cells and collect contributions for owned and ghost row ids, then exchange ghost contributions and assemble. Current SubAssembledRows.
- SubAssembled: We integrate on owned cells and collect contributions for owned and ghost row ids, but do no exchange. The resulting matrix is sub-assembled, i.e. each local matrix contains the contributions for its owned and ghost rows/cols resulting from integrating over the owned part of the domain.
- LocallyAssembled: This assumes all contributions can be found by integrating on the local portion of the domain. We integrate on local cells (owned and ghost), and keep contributions for owned row ids. No exchange is needed. Current FullyAssembledRows. Use at your own risk.
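The difference between the three strategies can be seen on a toy problem. The sketch below is plain Python, not GridapDistributed code. It assumes a 1D mesh of four cells split over two parts, with one ghost cell each, and simulates the "exchange" with a serial loop. All names and the partition layout are made up for illustration.

```python
from collections import defaultdict

# Cell c touches dofs c and c+1; element matrix is [[1, -1], [-1, 1]].
CELL_DOFS = {c: (c, c + 1) for c in range(4)}
PARTS = {
    0: {"owned_cells": [0, 1], "ghost_cells": [2], "owned_dofs": {0, 1, 2}},
    1: {"owned_cells": [2, 3], "ghost_cells": [1], "owned_dofs": {3, 4}},
}
DOF_OWNER = {d: p for p, part in PARTS.items() for d in part["owned_dofs"]}


def integrate(cells):
    """Sum element contributions over `cells` into {(row, col): value}."""
    A = defaultdict(float)
    for c in cells:
        i, j = CELL_DOFS[c]
        A[i, i] += 1.0
        A[j, j] += 1.0
        A[i, j] -= 1.0
        A[j, i] -= 1.0
    return dict(A)


def sub_assembled():
    # Owned cells only; owned and ghost rows are kept, nothing is exchanged.
    return {p: integrate(part["owned_cells"]) for p, part in PARTS.items()}


def assembled():
    # Same integration, then every row is sent to the part owning it.
    out = {p: {} for p in PARTS}
    for A in sub_assembled().values():
        for (r, c), v in A.items():
            owner = DOF_OWNER[r]
            out[owner][r, c] = out[owner].get((r, c), 0.0) + v
    return out


def locally_assembled():
    # Owned and ghost cells; only owned rows are kept, nothing is exchanged.
    out = {}
    for p, part in PARTS.items():
        A = integrate(part["owned_cells"] + part["ghost_cells"])
        out[p] = {rc: v for rc, v in A.items() if rc[0] in part["owned_dofs"]}
    return out
```

In this example dof 2 is owned by part 0 but also touched by cell 2, which part 1 owns. The sub-assembled matrix therefore holds a partial diagonal value for row 2 on each part, while the assembled and locally assembled results both hold the full row 2 on part 0 only. The two agree here because the single ghost layer is enough to cover every contribution to owned rows, which is exactly the assumption LocallyAssembled relies on.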

@amartinhuertas

Aug 30 '24 02:08 JordiManyer
