Peter Heywood
Recent runs of the Python test suite (CUDA 12.0, driver 535.104.05, Python 3.12) took a significant length of time under Linux ``` 650 passed, 11 skipped, 69 warnings in...
CUDA 12.4 introduces: > Add a new flag -minimal for NVRTC compilation. The -minimal flag omits certain language features to reduce compile time for small programs. In particular, the following...
+ [x] Reproduce Python-native edge case in C++ test suite + [ ] Expand edge case test coverage to expose a broader range of floating point issues + [ ]...
A CUDA managed memory implementation would enable oversubscription of the GPU on some systems (Linux with Pascal+, though Volta+ might be a better choice for performance reasons). This would...
FLAME GPU 2 is not currently usable for all types of agent models, primarily targeting models with many (~10k/100k/1000k) relatively light/small agents (low per-agent memory). This is due to the...
It might be nice to make FLAME GPU 2 installable via spack in the future. + https://spack-tutorial.readthedocs.io/en/latest/tutorial_packaging.html + https://spack.readthedocs.io/en/latest/packaging_guide.html# Likely depends on (or at least overlaps with) #260 and #317
Create a post-release checklist as part of contributing.md, listing other repos that will need updating / testing. This mainly needs to be the template repos + the tutorial(s).
#1160 added checks that comm radii are a factor of the environment when using spatial comms with seatbelts enabled. This seems to be incorrectly triggering in some cases, including the...
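
One possible cause of a false trigger in a "radius is a factor of the environment" check is floating-point rounding. This is an assumption for illustration, not something the issue confirms. The sketch below is plain Python, not the FLAME GPU 2 seatbelts code: the function names `is_factor_naive` and `is_factor_tolerant` and the `rel_tol` value are hypothetical.

```python
import math


def is_factor_naive(env_width, radius):
    """Exact remainder test: hypothetical, shown only to illustrate the pitfall."""
    # fmod is computed exactly, so any rounding in the inputs shows up here.
    return math.fmod(env_width, radius) == 0.0


def is_factor_tolerant(env_width, radius, rel_tol=1e-6):
    """Accept the radius if env_width / radius is within rel_tol of an integer.

    rel_tol is an assumed value, not taken from FLAME GPU 2.
    """
    ratio = env_width / radius
    nearest = round(ratio)
    # Require at least one whole bin across the environment.
    return nearest >= 1 and math.isclose(ratio, nearest, rel_tol=rel_tol)


if __name__ == "__main__":
    # 0.1 is not exactly representable in binary, so the exact test rejects it
    # even though 10 bins of radius 0.1 tile a width of 1.0 as intended.
    print(is_factor_naive(1.0, 0.1))     # False
    print(is_factor_tolerant(1.0, 0.1))  # True
    print(is_factor_tolerant(1.0, 0.3))  # False
```

A tolerance-based comparison trades strictness for robustness, so the tolerance would need to be chosen with the floating-point type used by the model in mind.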
Grace-Hopper and Jetson systems have an ARM host, rather than x86-64. GitHub Actions does not provide ARM runners, so cross compilation would be the best option for generating binary wheels...
## tldr + `pull_request` CI runs are semi-useful at best, spam at worst. + Do we want to reduce the frequency of them by either: 1. Do not run on...