Peter Heywood
The `ValidateIDCollisions` method / kernel(s) run when agent populations are set. If the population was created at runtime (i.e. not loaded from disk) and the population was originally empty then...
Once #379 is merged, the use of streams can still be improved to increase performance, within a single simulation and...
D2H / H2D transfers achieve higher bandwidth (2x+) when transferring to/from pinned memory. The downside is that pinning too much memory on a given system can lead to excessive paging...
For F1 parity, we need to be able to write state lists to disk every N iterations when `simulate` is used, via the config / CLI: + specify format +...
Now that there are extra environment variables and an RTC/SWIG/Python implementation, having an install step makes sense. This would need to set environment variables, copy files to appropriate locations, etc. Care must be...
CMake selects the compiler based on the extension of the file, i.e. `.cpp` files will use `g++` while `.cu` files will use `nvcc`. Visual Studio with CUDA enabled will...
Add instrumentation tools for simple performance analysis, along with NVTX-based markers for more advanced profiling. Introduce a profile build configuration which enables this. + [ ] Instrumentation + [x]...
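As a rough illustration of the "simple performance analysis" half of this, a scoped timer can accumulate host-side wall-clock time per label. This is a minimal sketch only: `TimingRegistry` and `ScopedTimer` are hypothetical names, not existing classes in the codebase, and it uses `std::chrono` rather than NVTX, so it says nothing about GPU-side timing.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical registry that accumulates wall-clock time per label.
class TimingRegistry {
 public:
    void add(const std::string &label, double seconds) {
        assert(seconds >= 0.0);  // steady_clock is monotonic
        Entry &entry = entries_[label];
        entry.total_seconds += seconds;
        ++entry.count;
    }
    // Total seconds recorded for a label (0 if never recorded).
    double total(const std::string &label) const {
        const auto it = entries_.find(label);
        return it == entries_.end() ? 0.0 : it->second.total_seconds;
    }
    // Number of timed regions recorded for a label.
    unsigned int count(const std::string &label) const {
        const auto it = entries_.find(label);
        return it == entries_.end() ? 0u : it->second.count;
    }

 private:
    struct Entry {
        double total_seconds = 0.0;
        unsigned int count = 0;
    };
    std::map<std::string, Entry> entries_;
};

// Hypothetical RAII timer: records the lifetime of the object under a label.
class ScopedTimer {
 public:
    ScopedTimer(TimingRegistry &registry, std::string label)
        : registry_(registry),
          label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        registry_.add(label_, elapsed.count());
    }
    ScopedTimer(const ScopedTimer &) = delete;
    ScopedTimer &operator=(const ScopedTimer &) = delete;

 private:
    TimingRegistry &registry_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Wrapping a region in `{ ScopedTimer t(registry, "step"); ... }` records one entry when the scope exits. In a profile build configuration, the same constructor and destructor would be a natural place to also push and pop an NVTX range, though that part is not shown here.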
Currently we have `CUDAEnsemble`, but no base `Ensemble` class (i.e. to match `Simulation` and `CUDASimulation`). We could: + Add a very simple `Ensemble` base class which would be intended to be...
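The first option could look roughly like the following. This is a sketch under assumptions, not the existing API: `RunPlan` here is a stand-in struct, the `simulate` signature is invented for illustration, and `SerialEnsemble` is a hypothetical host-only class used in place of a GPU-backed implementation so the example compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for one run's configuration (hypothetical, for illustration).
struct RunPlan {
    unsigned int steps;
};

// Minimal abstract base: the interface that derived ensembles would share.
class Ensemble {
 public:
    virtual ~Ensemble() = default;
    // Executes every plan; returns the number of runs that completed.
    virtual std::size_t simulate(const std::vector<RunPlan> &plans) = 0;
};

// Hypothetical host-only implementation that runs the plans one after another.
class SerialEnsemble : public Ensemble {
 public:
    std::size_t simulate(const std::vector<RunPlan> &plans) override {
        std::size_t completed = 0;
        for (const RunPlan &plan : plans) {
            // Placeholder for real work: count the steps each run would execute.
            total_steps_ += plan.steps;
            ++completed;
        }
        assert(completed == plans.size());
        return completed;
    }
    unsigned long long totalSteps() const { return total_steps_; }

 private:
    unsigned long long total_steps_ = 0;
};
```

Code that only needs to launch a batch of runs can then hold an `Ensemble &` and stay unaware of which backend executes them, mirroring the `Simulation` / `CUDASimulation` split.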
Hard-coded string literals for reserved variable names such as `_stepCount`, `_id`, `_agent_birth` etc. are bad practice. RTC/Jitify means there is a lot of forward declaration in headers, rather than includes, so we...
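One way to avoid scattering these literals is a single lightweight header of named constants. The sketch below assumes C++17, and the `reserved_names` namespace, constant names, and helper functions are all hypothetical; only the three string values come from the text above. Whether such a header is usable from RTC/Jitify-compiled code is not addressed here.

```cpp
#include <array>
#include <cassert>
#include <stdexcept>
#include <string>
#include <string_view>

namespace reserved_names {  // hypothetical namespace, for illustration only

inline constexpr std::string_view STEP_COUNT = "_stepCount";
inline constexpr std::string_view ID = "_id";
inline constexpr std::string_view AGENT_BIRTH = "_agent_birth";

// Single list of every reserved name, so checks cannot drift out of sync.
inline constexpr std::array<std::string_view, 3> ALL = {STEP_COUNT, ID, AGENT_BIRTH};

// True if `name` matches one of the reserved variable names.
constexpr bool isReserved(std::string_view name) {
    for (std::string_view reserved : ALL) {
        if (reserved == name) return true;
    }
    return false;
}

// Throws if a user-supplied variable name collides with a reserved one.
inline void checkUserVariableName(std::string_view name) {
    assert(!name.empty() && "variable names must be non-empty");
    if (isReserved(name)) {
        throw std::invalid_argument("reserved variable name: " + std::string(name));
    }
}

}  // namespace reserved_names

// The check is usable at compile time as well as at runtime.
static_assert(reserved_names::isReserved("_id"));
static_assert(!reserved_names::isReserved("id"));
```

Call sites would then refer to `reserved_names::ID` instead of repeating `"_id"`, so a typo becomes a compile error rather than a silent mismatch.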
The Stanage PyTorch documentation currently states that nightly builds must be used on the H100 nodes. From PyPI, PyTorch 2.1.0+ (October 4th 2023) is a CUDA 12.1 build which supports...