
Misc pending tasks associated with refactoring in branch release-0.2

Open fverdugo opened this issue 3 years ago • 11 comments

Merging into master PR https://github.com/gridap/GridapDistributed.jl/pull/50

  • [x] Update README
  • [x] remove src/OLD, test/OLD, compile
  • [x] Add NEWS.md

High priority

  • [x] Release Gridap 0.17.0
  • [x] Release PartitionedArrays 0.2.3
  • [x] Write data into pvtu format (@amartinhuertas) PR https://github.com/gridap/WriteVTK.jl/pull/1 https://github.com/gridap/GridapDistributed.jl/pull/46
  • [x] Create MPI launchers for the test (now we just test in serial mode) @fverdugo https://github.com/gridap/GridapDistributed.jl/pull/45
  • [x] Assembly of the rhs vector alone (@amartinhuertas) PR https://github.com/gridap/GridapDistributed.jl/pull/40
  • [x] Solve broken tests in FESpacesTests.jl, namely the assemble_XXX! functions do not work @fverdugo https://github.com/gridap/GridapDistributed.jl/pull/44
  • [x] Implement assembly strategy when one also iterates over ghost cells.
  • [x] Interpolation onto distributed multi-field spaces. @fverdugo PR https://github.com/gridap/GridapDistributed.jl/pull/42
  • [x] Port the uniformly refined forest-of-octrees distributed model to this new version (in GridapP4est) @amartinhuertas
  • [x] Add test drivers including DG examples. @fverdugo PR https://github.com/gridap/GridapDistributed.jl/pull/42
  • [x] Add test drivers including nonlinear examples. @fverdugo PR https://github.com/gridap/GridapDistributed.jl/pull/44
  • [x] https://github.com/fverdugo/PartitionedArrays.jl/issues/33 https://github.com/fverdugo/PartitionedArrays.jl/pull/35
  • [x] @asserts triggered locally on a processor may lead to deadlock! We should call mpi_abort whenever an AssertionError exception is triggered. PR https://github.com/fverdugo/PartitionedArrays.jl/pull/39
  • [ ] AD in parallel. Talk to @amartinhuertas to grasp the challenges there.
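One of the checked items above concerns rank-local @asserts causing deadlocks: if an assertion fails on one rank while the other ranks are waiting in a collective call, the job hangs forever. A minimal sketch of the idea (the helper name safe_assert is hypothetical, not the actual PartitionedArrays API; only MPI.Abort is a real MPI.jl call):

```julia
using MPI

# Hypothetical helper: turn a rank-local assertion failure into a
# global abort, so the whole job fails fast instead of deadlocking
# with the other ranks blocked in a collective operation.
function safe_assert(cond::Bool, msg::AbstractString="assertion failed")
  if !cond
    @error msg
    MPI.Abort(MPI.COMM_WORLD, 1)  # terminates all MPI processes
  end
  nothing
end
```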

Medium priority

  • [x] Nicer user API for PETScSolver (to be done in GridapPETSc). In particular, better handling of the lifetime of PETSc objects.
  • [x] Implement distributed models from gmsh (in GridapGmsh) @fverdugo
  • [x] Implement lu! for PSparseMatrix following the same strategy as for \
  • [x] Move mpi tests to their own CI job @amartinhuertas PR https://github.com/gridap/GridapDistributed.jl/pull/47
  • [x] Automatically generate sysimage to reduce mpi tests runtime @amartinhuertas PR https://github.com/gridap/GridapDistributed.jl/pull/48
  • [ ] Think how the user can define the local and global vector type in the FESpace
  • [ ] Showing the NLSolve.jl solver trace only on the master MPI rank leads to a deadlock.
  • [ ] Periodic BCs
  • [ ] Implement ZeroMeanFESpace

Low priority

  • [ ] Implement a lazier initialization of the matrix exchanger in PartitionedArrays, since the exchanger is not always needed.
  • [x] Overlap compression of the sparse matrix with communication of rhs assembly
  • [ ] Implement another strategy to represent local matrices in PartitionedArrays with a data layout compatible with PETSc
  • [x] interpolate_everywhere is not available in GridapDistributed.jl, only interpolate. Solved in PR https://github.com/gridap/GridapDistributed.jl/pull/74

fverdugo avatar Oct 07 '21 07:10 fverdugo

Let me document here an assorted set of issues as I go through GridapDistributed.jl (so I do not forget; these are temporary, and I may add new items on the fly):

  • [Minor; solved in PR https://github.com/gridap/GridapDistributed.jl/pull/40] DistributedSparseMatrixAssembler does not have a strategy member variable, and I think it should, as required by the get_assembly_strategy function (among others).
  • [Important] If you use the FullyAssembledRows() strategy in DistributedSparseMatrixAssembler but do not pass the ghost cells in the raw assembly data, the code does not complain. The same happens in the converse case, i.e., SubAssembledRows() combined with raw data that includes ghost cells. I think this is VERY dangerous and error-prone. We should try to find a solution.
  • Why are the assembly objects called Allocation and NOT Allocator?
  • Why is the function in GridapDistributed.jl called local_views, in plural, rather than local_view?
  • More issues to come ...
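To make the [Important] item above concrete, here is a hedged sketch of the undetected mismatch, assuming the API names used elsewhere in this thread (with_ghost, FullyAssembledRows(); the no_ghost name and exact signatures are assumptions and may differ in the actual code):

```julia
using Gridap, GridapDistributed

# FullyAssembledRows() expects the integration loop to visit ghost
# cells, so the triangulation must include the ghost layer:
Ωg  = Triangulation(with_ghost, model)  # owned + ghost cells
dΩg = Measure(Ωg, 2)
a(u, v) = ∫(∇(v) ⋅ ∇(u))dΩg             # raw data covers ghost cells: OK

# The undetected mismatch: the same strategy combined with a
# triangulation of owned cells only silently drops the ghost-row
# contributions; no error is raised and the matrix is simply wrong.
Ω  = Triangulation(no_ghost, model)     # owned cells only
dΩ = Measure(Ω, 2)
a_bad(u, v) = ∫(∇(v) ⋅ ∇(u))dΩ          # raw data misses ghost cells
```

A runtime check that the assembly strategy agrees with the triangulation's ghost policy would catch both directions of the mismatch.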

amartinhuertas avatar Oct 15 '21 08:10 amartinhuertas

Hi @amartinhuertas!

I have opened a new branch https://github.com/gridap/GridapDistributed.jl/tree/release-0.2 from the current state.

We can use this branch as code that "is working" and do intermediate devs somewhere else.

fverdugo avatar Oct 15 '21 14:10 fverdugo

@amartinhuertas I have classified the pending tasks as high/medium/low priority. (Move tasks from one tier to another if you find it necessary.)

For me, we can merge the branch once the high-priority tasks are done. The remaining low/medium-priority ones can be moved to separate issues.

fverdugo avatar Oct 18 '21 09:10 fverdugo

> For me, we can merge the branch once the high-priority tasks are done. The remaining low/medium-priority ones can be moved to separate issues.

Ok. I have added a new high-priority task. Agreed.

amartinhuertas avatar Oct 18 '21 10:10 amartinhuertas

@amartinhuertas I am adding a nonlinear test and I have found a bug associated with the in-place assembly functions assemble_xxx!. I am fixing it.

fverdugo avatar Oct 19 '21 07:10 fverdugo

> @amartinhuertas I am adding a nonlinear test and I have found a bug associated with the in-place assembly functions assemble_xxx!. I am fixing it.

Yes, indeed, I added @test_broken macro calls in FESpacesTests.jl in relation to these. You will find them in the rhs_assembly_branch. Once you fix the bug, you can replace them with @test.

amartinhuertas avatar Oct 19 '21 08:10 amartinhuertas

Hi @amartinhuertas

I have added the nonlinear example. In particular,

  • the in-place assembly functions are working.
  • one can use the parallel assembly strategy to build a valid triangulation, e.g., Triangulation(with_ghost,model) and Triangulation(FullyAssembledRows(),model) are equivalent.

fverdugo avatar Oct 19 '21 15:10 fverdugo

Hi @amartinhuertas, @fverdugo, what are the developments needed for periodic BCs?

oriolcg avatar Jan 17 '22 14:01 oriolcg

At least, the ghost layer has to be built taking periodicity into account. And perhaps other developments are needed.

fverdugo avatar Jan 17 '22 15:01 fverdugo

Hi @amartinhuertas and @fverdugo, the last task in the list can be checked off once PR #74 is accepted. Also, I suggest adding another task to the list: support ZeroMeanFESpace. I guess this is only a matter of selecting a single local space in which to constrain a DOF, and of using the global volume instead of the local volume as a member of the struct. What do you think about this?

oriolcg avatar Feb 02 '22 10:02 oriolcg

> the last task in the list can be checked off once PR #74 is accepted.

Great @oriolcg ! Thanks for your work!

> support ZeroMeanFESpace

Sure. Task added.

> I guess this is only a matter of selecting a single local space in which to constrain a DOF, and of using the global volume instead of the local volume as a member of the struct. What do you think about this?

I indeed implemented this in GridapDistributed.jl v0.1.0. See https://github.com/gridap/GridapDistributed.jl/blob/v0.1.0/src/ZeroMeanDistributedFESpaces.jl and https://github.com/gridap/GridapDistributed.jl/blob/v0.1.0/test/ZeroMeanDistributedFESpacesTests.jl. I guess it should just be a matter of rewriting this using the new code organization in v0.2.0.
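The global-volume part of the idea can be sketched with a plain MPI reduction (the helper global_volume is hypothetical and only illustrates the concept; the actual v0.1.0 implementation linked above differs in its details):

```julia
using MPI

# Each rank integrates the measure of its owned cells locally; the
# zero-mean constraint then needs the sum over all ranks, which a
# single Allreduce provides consistently on every process.
function global_volume(local_volume::Real, comm::MPI.Comm=MPI.COMM_WORLD)
  MPI.Allreduce(local_volume, +, comm)
end
```

Only the rank owning the chosen constrained DOF fixes it, while every rank uses the same global volume when subtracting the mean from the solution.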

amartinhuertas avatar Feb 02 '22 10:02 amartinhuertas