FLAMEGPU2
Feature: Instrumentation / Profiling tools
Add instrumentation tools for simple performance analysis, along with NVTX-based markers for more advanced profiling. Introduce a profile build configuration which enables this.
- [ ] Instrumentation
- [x] `--timing` flag for simple timing
- [x] Make timing values programmatically accessible, rather than just output to stdout? This would need to be recorded regardless of the presence of `--timing`
- [x] Simulation timing
- [ ] Pre/Post Simulation timing (for transparency).
- [x] Timing per Iteration - there will be variance in some iterations due to buffer growth etc.
- [ ] Timing per layer (per agent function not viable if concurrency is enabled)
- [x]
- [x] NVTX
- [x] NVTX linked
- [x] NVTX utility class (`include/util/nvtx.h`)
- ~Tests (not sure how)~ - not programmatically testable?
- [x] Doxygen docs (guarded by macros)
- [x] Better NVTX ranges (probably during #379).
- [x] Include a range for the call to `simulate`
- [ ] Ideally capture Agent population generation - may have to go in the individual model, or as part of #246
- [x] Include a range for the call to
- [ ] Output simulation/ensemble information to disk, (potentially/optionally) including timing info (total, per step etc?), population data, GPU executed on, driver version, CUDA version, FLAMEGPU version, user-provided model versioning? etc.
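
As a rough illustration of the timing items in the list above, per-step times could be recorded unconditionally, with the `--timing` flag only controlling whether they are printed. This is a minimal host-side sketch, not FLAMEGPU2 code: `StepTimer`, `timeStep`, `stepMilliseconds` and `totalMilliseconds` are hypothetical names, and it uses `std::chrono` wall-clock timing.

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Hypothetical recorder; none of these names are taken from FLAMEGPU2.
class StepTimer {
 public:
    // print_timing mirrors a --timing style flag: it controls printing only.
    explicit StepTimer(bool print_timing) : print_timing_(print_timing) {}

    // Run one step (any callable) and always record its wall-clock duration.
    template <typename Step>
    void timeStep(Step &&step) {
        const auto start = std::chrono::steady_clock::now();
        step();
        const auto stop = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        step_ms_.push_back(ms);
        if (print_timing_) {
            std::printf("Step %zu: %.3f ms\n", step_ms_.size() - 1, ms);
        }
    }

    // Programmatic access to the recorded per-step values.
    const std::vector<double> &stepMilliseconds() const { return step_ms_; }

    // Sum of all recorded steps, in milliseconds.
    double totalMilliseconds() const {
        double total = 0.0;
        for (const double ms : step_ms_) total += ms;
        return total;
    }

 private:
    bool print_timing_;
    std::vector<double> step_ms_;
};
```

Because kernel launches are asynchronous, host-side timing like this only measures GPU work if the step synchronises before returning. CUDA events would be an alternative for device-side timing.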
For CMake, enabling NVTX could be handled along the lines of the following (but maybe just for the profile target, and such that the build proceeds without NVTX if it is not found):
```cmake
# Switch to enable NVTX ranges for profiling
option(USE_NVTX "Build with NVTX" ON)
if(USE_NVTX)
    message("-- Using NVTX")
    find_library(NVTX_LIBRARY nvToolsExt HINTS /usr/local/cuda/lib64)
    target_link_libraries(flamegpu2 ${NVTX_LIBRARY})
    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS}; -DUSE_NVTX)
endif(USE_NVTX)
```
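
On the C++ side, the `USE_NVTX` definition set above could guard a small RAII range helper, so that builds without NVTX compile the markers away. The sketch below is an assumption about how such a utility might look and is not the contents of `include/util/nvtx.h`. `ScopedRange` is a hypothetical name, while `nvtxRangePushA` and `nvtxRangePop` are the NVTX C API calls.

```cpp
#if defined(USE_NVTX)
#include <nvToolsExt.h>
#endif

// Hypothetical RAII helper: pushes a named NVTX range on construction and
// pops it on destruction. It compiles to a no-op when USE_NVTX is undefined.
class ScopedRange {
 public:
    explicit ScopedRange(const char *label) {
#if defined(USE_NVTX)
        nvtxRangePushA(label);
#else
        (void)label;  // Avoid an unused-parameter warning in non-NVTX builds.
#endif
    }

    ~ScopedRange() {
#if defined(USE_NVTX)
        nvtxRangePop();
#endif
    }

    // Ranges are tied to a scope, so copying is disabled.
    ScopedRange(const ScopedRange &) = delete;
    ScopedRange &operator=(const ScopedRange &) = delete;
};
```

Declaring `ScopedRange range("simulate");` at the top of a function would then mark that call as a range in the profiler timeline, at no cost in builds where NVTX is disabled.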
It might be useful to add a simulation / ensemble config flag/argument to control the output of population / step times / other simulation metadata as CSV/JSON.
I.e. `simulation.exe --output-metadata path/to/file.json`.
When this property / flag is set in the config, track populations, timing and the GPU used, and output them to a JSON file of a specific format.
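
To make the idea concrete, a per-simulation record could be serialised along these lines. The layout is purely illustrative, since no format has been decided: `RunMetadata`, `StepRecord`, `toJson` and all field names are hypothetical, and string values are assumed not to need JSON escaping.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-step record.
struct StepRecord {
    double ms;                // Step duration in milliseconds
    unsigned int population;  // Agent count after the step
};

// Hypothetical per-simulation record; field names are illustrative only.
struct RunMetadata {
    std::string gpu_name;
    std::string cuda_version;
    double total_ms = 0.0;
    std::vector<StepRecord> steps;
};

// Serialise to a compact JSON string.
// Assumption: string fields contain no characters that need escaping.
std::string toJson(const RunMetadata &m) {
    std::ostringstream out;
    out << "{\"gpu\":\"" << m.gpu_name << "\","
        << "\"cuda_version\":\"" << m.cuda_version << "\","
        << "\"total_ms\":" << m.total_ms << ","
        << "\"steps\":[";
    for (std::size_t i = 0; i < m.steps.size(); ++i) {
        if (i > 0) out << ",";
        out << "{\"ms\":" << m.steps[i].ms
            << ",\"population\":" << m.steps[i].population << "}";
    }
    out << "]}";
    return out.str();
}
```

The returned string could be written to the path given by the flag with a `std::ofstream`. A real implementation would more likely use an existing JSON library than hand-built strings.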
Alternatively, output multiple CSVs (as CSVs must be rectangular), with per-step data in one CSV and general per-simulation data in another.
Ensembles could output to a single JSON file, or to multiple files in a directory, depending on implementation. Outputting to individual files, then re-forming them into a single file and cleaning up, is another possibility (to reduce memory requirements for incredibly large ensembles).
More fine-grained control might also be useful. I.e. just output metadata such as FLAMEGPU build number, CUDA / NVRTC version etc. to disk in some cases, track per-ensemble / per-sim total time, or track finer-grained timing / population sizes.
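
One possible shape for this kind of control is a bitmask, so that each category of output can be toggled independently. The flag names and the `hasFlag` helper below are hypothetical and only mirror the categories mentioned above.

```cpp
#include <cstdint>

// Hypothetical output categories, combinable with bitwise OR.
enum MetadataFlags : std::uint32_t {
    METADATA_NONE       = 0u,
    METADATA_BUILD_INFO = 1u << 0,  // FLAMEGPU build, CUDA / NVRTC versions
    METADATA_TOTAL_TIME = 1u << 1,  // Per-ensemble / per-simulation total time
    METADATA_STEP_TIME  = 1u << 2,  // Finer-grained, per-step timing
    METADATA_POPULATION = 1u << 3   // Population sizes
};

// True if the given category is enabled in the combined flags.
constexpr bool hasFlag(std::uint32_t flags, MetadataFlags flag) {
    return (flags & flag) != 0u;
}
```

A config value of `METADATA_BUILD_INFO | METADATA_TOTAL_TIME` would then request only the cheap-to-collect data, leaving per-step tracking disabled.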
Submodels will also make this more complicated if tracking within submodels is required, as will tracking long-running simulations in general.