Peter Heywood issues

Results: 100 issues of Peter Heywood

Python / NVRTC performance (CUDA 12.2+)

Recent runs of the Python test suite (CUDA 12.0, 535.104.05, Python 3.12) took a long time to run under Linux: 650 passed, 11 skipped, 69 warnings in...

CUDA 12.4+ NVRTC `-minimal`

CUDA 12.4 introduces:
> Add a new flag -minimal for NVRTC compilation. The -minimal flag omits certain language features to reduce compile time for small programs. In particular, the following...
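As a rough illustration of how a build or runtime-compilation layer might opt in to the flag quoted above, here is a minimal Python sketch. The helper name `nvrtc_options` and the way the toolkit version is passed in are assumptions made for illustration; they are not part of NVRTC or FLAME GPU. Only the `-minimal` flag and its CUDA 12.4 requirement come from the issue text.

```python
from typing import List, Optional, Tuple


def nvrtc_options(cuda_version: Tuple[int, int],
                  base: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper: build an NVRTC option list for a CUDA version.

    cuda_version is a (major, minor) tuple, e.g. (12, 4).
    """
    options = list(base) if base else []
    # -minimal only exists from CUDA 12.4 onwards, so it is added
    # conditionally and never duplicated.
    if cuda_version >= (12, 4) and "-minimal" not in options:
        options.append("-minimal")
    return options


if __name__ == "__main__":
    print(nvrtc_options((12, 4), ["-std=c++17"]))
    print(nvrtc_options((12, 2), ["-std=c++17"]))
```

Because the flag omits certain language features, any real use would presumably need a way to turn it off for programs that rely on those features.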

Fix Spatial Wrapped invalid env/radii detection

+ [x] Reproduce Python-native edge case in C++ test suite
+ [ ] Expand edge case test coverage to expose a broader range of floating point issues
+ [ ]...

Unified memory for device oversubscription

A CUDA managed memory implementation would enable oversubscription of the GPU on some systems (Linux with Pascal+, though Volta+ might be a better choice for performance reasons). This would...

enhancement

Agent type for low-population highly parallel workload agents

FLAME GPU 2 is not currently usable for all types of agent models, primarily targeting models with many (~10k/100k/1000k) relatively light/small agents (low per-agent memory). This is due to the...

It might be nice to make FLAME GPU 2 installable via Spack in the future.
+ https://spack-tutorial.readthedocs.io/en/latest/tutorial_packaging.html
+ https://spack.readthedocs.io/en/latest/packaging_guide.html#

Likely depends on (or at least overlaps with) #260 and #317

Post-release checklist

Create a post-release checklist as part of contributing.md, listing other repos that will need updating / testing. This mainly needs to be the template repos + the tutorial(s).

Documentation

Wrapped MsgSpatial interaction radius factor bug

#1160 added checks that communication radii are a factor of the environment when using spatial comms with seatbelts enabled. This seems to be incorrectly triggering in some cases, including the...

bug
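The radius check described in the wrapped spatial messaging issue above is a classic place for floating-point error to cause false positives. The sketch below is not the FLAME GPU implementation; the function names and the tolerance are assumptions chosen for illustration. It contrasts an exact remainder test with a tolerance-based test on the ratio of environment width to radius.

```python
import math


def naive_is_factor(env_width: float, radius: float) -> bool:
    """Exact remainder test: rejects many values that are factors on paper."""
    return math.fmod(env_width, radius) == 0.0


def tolerant_is_factor(env_width: float, radius: float,
                       rel_tol: float = 1e-6) -> bool:
    """Accept the radius if width / radius is within rel_tol of an integer.

    rel_tol is an assumed value, not taken from the issue.
    """
    ratio = env_width / radius
    nearest = round(ratio)
    return nearest >= 1 and math.isclose(ratio, nearest, rel_tol=rel_tol)


if __name__ == "__main__":
    # 0.1 has no exact binary representation, so the exact test fails
    # even though 1.0 is ten times 0.1 on paper.
    print(naive_is_factor(1.0, 0.1))
    print(tolerant_is_factor(1.0, 0.1))
    # A radius that genuinely does not divide the width is still rejected.
    print(tolerant_is_factor(1.0, 0.3))
```

A relative tolerance is one possible design choice; a production check would also have to account for the precision actually used to store the environment bounds and radius.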

ARM pyflamegpu binary wheels (Cross-compilation)

Grace-Hopper and Jetson systems have an ARM host, rather than x86-64. GitHub Actions does not provide ARM runners, so cross-compilation would be the best option for generating binary wheels...

cmake

GitHub Actions pull_request behaviour

## tldr
+ `pull_request` CI runs are semi-useful at best, spam at worst.
+ Do we want to reduce the frequency of them by either:
  1. Do not run on...