GridapDistributed.jl
Updating to PartitionedArrays v0.5
We have been detached from the development of PartitionedArrays for a while and are currently stuck at v0.3. This is also related to #137.
This PR updates GridapDistributed to PartitionedArrays v0.5, with a substantial rework of the re-assembly process... This comes with several upsides, such as:
- split format for distributed vectors and matrices
- re-partitioning of distributed data
- sub-assembled and dis-assembled matrices
- optimisations
Change Log
- FESpace free dof ids are now given by a permuted instance of `LocalIndicesWithVariableBlockSize`. We were already doing this (so our numbering strategy does not change), but it's now explicit. This has two advantages:
  - To find the index partitions for the assembled system, we just un-permute the indices, i.e. we get the underlying `LocalIndicesWithVariableBlockSize`.
  - Finding owners for arbitrary gids now requires only minimal communication (just a `scan` operation). This is quite handy for assembly, since expanding the ghost ids in the index partition can be done using PArrays's `find_owner` and `union_ghost` methods.
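To illustrate why a block-wise partition makes owner lookup cheap, here is a minimal sketch in plain Python. It is not the PartitionedArrays implementation: the names `block_offsets` and `owner_of_gid`, the 0-based numbering, and the block sizes are all assumptions made for the example. The only collective step is the scan over the per-rank block sizes; after that, every rank can find the owner of any gid locally.

```python
from bisect import bisect_right
from itertools import accumulate


def block_offsets(block_sizes):
    """Exclusive scan of the block sizes: first gid (0-based) of each rank, plus the total."""
    return [0, *accumulate(block_sizes)]


def owner_of_gid(gid, offsets):
    """Rank that owns `gid`, found by bisection on the scanned offsets (no communication)."""
    if not 0 <= gid < offsets[-1]:
        raise IndexError(f"gid {gid} is outside the global range")
    return bisect_right(offsets, gid) - 1


# Hypothetical layout: 3 ranks owning 4, 2 and 5 dofs respectively.
block_sizes = [4, 2, 5]
offsets = block_offsets(block_sizes)
owners = [owner_of_gid(g, offsets) for g in (0, 3, 4, 5, 6, 10)]
print(offsets)  # [0, 4, 6, 11]
print(owners)   # [0, 0, 1, 1, 2, 2]
```

In a distributed run each rank would only know its own block size, so the offsets would come from a parallel scan rather than from a local list.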
Notes:
Assembly cache reuse strategy
The most involved part of this PR is the assembly cache re-use. The issue is that so far we had been using the output PSparseMatrix as the cache, by allocating extra ghost rows that were only used in the sub-assembly process. This has quite a lot of downsides (see #137), so we are pivoting towards an external assembly cache model.
However, having external caches does not go well with the current Gridap API. Here is a possible solution:
- `assemble_matrix!` will not be supported, except if an `Assembler` is provided (see next point).
- The `Assembler` will hold the caches. To be able to reuse an assembler for multiple matrices, we will use a `Dict` and the matrix object-id to hold multiple caches (tied to a single matrix each). Cache re-use will be activated by a boolean variable `reuse` that is held by the `Assembler`.
- The `FEOperators` will keep an instance of their assembler, allowing re-use by default for nonlinear and transient `FEOperators`.
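The external-cache idea above could look roughly like the following sketch. It is written in plain Python purely to show the bookkeeping, and it is not the Gridap or GridapDistributed API: `CachingAssembler`, its methods, and the contents of the cache are hypothetical. Python's `id` plays the role of the matrix object-id here.

```python
class CachingAssembler:
    """Hypothetical assembler that owns one assembly cache per matrix it has built."""

    def __init__(self, reuse=True):
        self.reuse = reuse
        self._caches = {}  # matrix object-id -> cache tied to that single matrix

    def assemble_matrix(self, data):
        matrix = {"values": list(data)}  # stand-in for a distributed sparse matrix
        if self.reuse:
            # A real cache would hold communication buffers, COO data, etc.
            self._caches[id(matrix)] = {"nnz": len(data), "n_reassemblies": 0}
        return matrix

    def assemble_matrix_inplace(self, matrix, data):
        cache = self._caches.get(id(matrix))
        if cache is None:
            raise KeyError("no cache for this matrix; build it with reuse=True")
        if len(data) != cache["nnz"]:
            raise ValueError("new data does not match the cached sparsity pattern")
        matrix["values"][:] = data
        cache["n_reassemblies"] += 1
        return matrix


asm = CachingAssembler(reuse=True)
A = asm.assemble_matrix([1.0, 2.0])
B = asm.assemble_matrix([3.0])              # second matrix, separate cache
asm.assemble_matrix_inplace(A, [5.0, 6.0])  # reuses the cache tied to A only
```

One thing to keep in mind with identity-keyed caches is that entries outlive the matrix unless they are removed explicitly, so a real implementation would probably need some cleanup policy.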
Tasks from #137
- [ ] We will change the current assembly strategies so that we allow the user to retrieve an assembled vs sub-assembled PSparseMatrix.
- [ ] In both cases, the re-assembly will be done by re-using the COO matrix (instead of directly inserting in the matrix). This will save binary searches, which can be quite costly. The re-assembly caches should be stored somewhere, possibly the FEOperator or Solver.
- [ ] The IndexPartitions in the matrix should be block-wise, so that the owner of a dof can be locally deduced from its gid. This will save us some communication. I believe we are looking for the struct LocalIndicesWithVariableBlockSize.
- [ ] For performance reasons, we might still want to reorder the dofs in the FESpace so that owned dofs come first, then ghosts. Although during integration the memory accesses are still quite random, it will perform better when we are (eventually) able to use FESpace-allocated vectors with the system matrix.
- [ ] PartitionedArrays will relax the conditions on the ghost layouts for its matrix-vector product. See https://github.com/fverdugo/PartitionedArrays.jl/issues/127.
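The COO-based re-assembly from the second task can be sketched as follows. This is a generic illustration in plain Python, not what PartitionedArrays does internally: the function names and the CSR layout are assumptions. The point is that the map from each COO triplet to its slot in the compressed values array is computed once and stored as the cache; re-assembly then becomes a plain scatter-add with no binary searches.

```python
def build_csr_with_cache(n_rows, rows, cols, vals):
    """Compress COO triplets into CSR and record where each triplet landed."""
    # Unique (row, col) pairs in row-major order define the CSR layout.
    keys = sorted(set(zip(rows, cols)))
    slot = {key: k for k, key in enumerate(keys)}
    rowptr = [0] * (n_rows + 1)
    for r, _ in keys:
        rowptr[r + 1] += 1
    for r in range(n_rows):
        rowptr[r + 1] += rowptr[r]
    colval = [c for _, c in keys]
    # The re-assembly cache: position in nzval of every COO entry.
    coo_to_nz = [slot[(r, c)] for r, c in zip(rows, cols)]
    nzval = [0.0] * len(keys)
    reassemble(nzval, coo_to_nz, vals)
    return rowptr, colval, nzval, coo_to_nz


def reassemble(nzval, coo_to_nz, vals):
    """Refill nzval from new COO values using the cached positions."""
    for k in range(len(nzval)):
        nzval[k] = 0.0
    for k, v in zip(coo_to_nz, vals):
        nzval[k] += v  # duplicates (same row and col) accumulate


# Hypothetical 2x2 example with one duplicated entry at (0, 0).
rows = [0, 0, 1, 1, 0]
cols = [0, 1, 1, 0, 0]
rowptr, colval, nzval, cache = build_csr_with_cache(2, rows, cols, [1.0, 2.0, 3.0, 4.0, 5.0])
first = list(nzval)
reassemble(nzval, cache, [10.0, 20.0, 30.0, 40.0, 50.0])
print(first)  # [6.0, 2.0, 4.0, 3.0]
print(nzval)  # [60.0, 20.0, 40.0, 30.0]
```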
Assembly strategies
With PA 0.5, new assembly strategies are available. Also, we've been wanting to fix the confusing names we currently have. I foresee the following assembly strategies:
- `Assembled`: We integrate on owned cells and collect contributions for owned and ghost row ids, then exchange ghost contributions and assemble. Current `SubAssembledRows`.
- `SubAssembled`: We integrate on owned cells and collect contributions for owned and ghost row ids, but no exchange. The resulting matrix is sub-assembled, i.e. each local matrix contains the contribution for its owned and ghost rows/cols resulting from integrating over the owned part of the domain.
- `LocallyAssembled`: This assumes all contributions can be found by integrating on the local portion of the domain. We integrate on local cells (owned and ghost), and keep contributions for owned row ids. No exchange is needed. Current `FullyAssembledRows`. Comes at your own risk.
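To make the difference between the three strategies concrete, here is a toy sketch in plain Python on a 1D mesh with 4 cells, 5 nodes and 2 parts. Everything in it is an assumption made for illustration (the mesh, the partition, the P1 stiffness element matrix with unit cell size, and the function names); none of it is GridapDistributed code, and the "exchange" is simulated by routing entries to the row owner inside a single process.

```python
from collections import defaultdict

CELLS = {0: (0, 1), 1: (1, 2), 2: (2, 3), 3: (3, 4)}  # cell -> node (dof) ids
K_ELEM = ((1.0, -1.0), (-1.0, 1.0))                    # P1 stiffness, h = 1
PARTS = {
    0: {"owned_cells": [0, 1], "ghost_cells": [2], "owned_rows": {0, 1, 2}},
    1: {"owned_cells": [2, 3], "ghost_cells": [1], "owned_rows": {3, 4}},
}
ROW_OWNER = {r: p for p, part in PARTS.items() for r in part["owned_rows"]}


def integrate(cells):
    """Sum element matrices of `cells` into a {(row, col): value} dict."""
    local = defaultdict(float)
    for c in cells:
        nodes = CELLS[c]
        for a, i in enumerate(nodes):
            for b, j in enumerate(nodes):
                local[(i, j)] += K_ELEM[a][b]
    return dict(local)


def sub_assembled():
    """Owned cells only, owned and ghost rows kept, no exchange."""
    return {p: integrate(part["owned_cells"]) for p, part in PARTS.items()}


def assembled():
    """Owned cells only, then ghost-row contributions are sent to the row owner."""
    result = {p: defaultdict(float) for p in PARTS}
    for local in sub_assembled().values():
        for (i, j), v in local.items():
            result[ROW_OWNER[i]][(i, j)] += v  # simulated exchange
    return {p: dict(m) for p, m in result.items()}


def locally_assembled():
    """Owned and ghost cells, only owned rows kept, no exchange."""
    out = {}
    for p, part in PARTS.items():
        local = integrate(part["owned_cells"] + part["ghost_cells"])
        out[p] = {ij: v for ij, v in local.items() if ij[0] in part["owned_rows"]}
    return out


sub, full, loc = sub_assembled(), assembled(), locally_assembled()
print(sub[0][(2, 2)], sub[1][(2, 2)])  # 1.0 1.0  (row 2 is split across parts)
print(full[0][(2, 2)])                 # 2.0      (complete after the exchange)
print(full == loc)                     # True     (holds for this toy problem)
```

In this example the locally assembled result matches the assembled one because every cell touching an owned row is present in the local (owned plus ghost) cells; when that assumption fails, the two would differ.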
@amartinhuertas