SIMULATeQCD Finish documentation

Fix the TODOs in the documentation

Nov 22 '21 11:11 lukas-mazur

Add missing documentation for:

[x] General structure of the code (merge with existing wikipage "How to organize new files")
[x] CommunicationBase
[ ] Spinorfield (stacks!)
[ ] CUDA P2P, CUDA-aware MPI, CUDA IPC: what they mean for speedup and memory consumption (halos!)
[x] Functor syntax (general idea, universal across Gaugefields, Spinorfields, LatticeContainer, ...; adopt from src/testing/main_GeneralFunctorTest.cu)
[ ] Compression types, mixed precision
[ ] Inverter (explain the various CG flavors we have)
[ ] Dslash (explain what members do to their arguments, e.g. multiply with Deo)
[ ] explain our CMakeLists.txt
[ ] Halos, halo exchange (what/why, Gaugefield/Spinorfields can do exchanges, LatticeContainer can't...)
[ ] Reduction
[x] Something about template usage in our code and explicit instantiation preprocessor magic (Halodepth, Stacksize, etc...)
[x] modules (RHMC!)

Nov 22 '21 12:11 luhuhis

[ ] We should also write a beginner tutorial based on GeneralFunctorTest

Nov 23 '21 12:11 lukas-mazur

To add to the things Luis listed:

[x] Gauge fixing: There is already a gauge fixing application, so move the "gauge fixing" article under the applications section and just explain that application and how to add a new observable to its code. Explain observables clearly, the way the gradient flow wiki does. In addition, add some information about the Polyakov loop correlators to the gradient flow article.
[ ] GIndexer and HaloIndexer: These are complicated classes, especially when running on multiple GPUs. Explain all of the functions inside the classes and give examples. Why do we put the lattice dimension information in the constant buffer of the GPU?
[ ] Memory layouts and accessors: The LatticeContainer stores one element at each index, where an element could be, for example, an SU3 matrix. That means all 9 complex values of one SU3 element lie consecutively in memory at that index position. The Gaugefield and the Spinorfield, on the other hand, are optimized for coalesced memory access. For example, in the Spinorfield an element is a vector of 3 complex values, and these elements get split up in memory: the first entries of all vectors in the Spinorfield are stored consecutively, followed by all second entries, and then all third entries.
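
The two layouts described in the last item could be illustrated in the docs with something like the sketch below. It is only a rough illustration: `aosIndex`, `soaIndex`, `toSoA`, and `kColors` are made-up names, not the actual accessor API, and plain host C++ stands in for the GPU code. The first index function keeps all components of a site next to each other (LatticeContainer-like); the second stores all first components, then all second, then all third (Spinorfield-like).

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is NOT the SIMULATeQCD accessor API.
using Complex = std::complex<double>;
constexpr std::size_t kColors = 3;  // complex components per color vector

// Array-of-structures layout: the components of one site are adjacent.
constexpr std::size_t aosIndex(std::size_t site, std::size_t comp) {
    return site * kColors + comp;
}

// Structure-of-arrays layout: component 0 of every site comes first,
// then component 1 of every site, then component 2.
constexpr std::size_t soaIndex(std::size_t site, std::size_t comp,
                               std::size_t nSites) {
    return comp * nSites + site;
}

// Reorder a buffer from the AoS layout into the SoA layout.
inline std::vector<Complex> toSoA(const std::vector<Complex>& aos,
                                  std::size_t nSites) {
    std::vector<Complex> soa(aos.size());
    for (std::size_t site = 0; site < nSites; ++site) {
        for (std::size_t comp = 0; comp < kColors; ++comp) {
            soa[soaIndex(site, comp, nSites)] = aos[aosIndex(site, comp)];
        }
    }
    return soa;
}
```

When neighbouring threads handle neighbouring sites and all read the same component, the second layout makes them touch adjacent addresses, which is what allows the reads to coalesce.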

Dec 20 '21 14:12 clarkedavida

Another thing that's missing:

[x] How to use the rat_approx executable. There's a PDF for it, but we should merge that into the docs.

Jun 08 '22 09:06 luhuhis

I tried to clean up and complete more documentation before we put up our paper. Any TODOs that I did not handle, I decided to just list here for future reference.

[ ] Memory management: Add examples showing how Accessors are used.
[ ] Integrator: There was a TODO called "HISQ smearing after gauge update"... I'm not sure what was meant.

Nov 27 '22 06:11 clarkedavida