parsec
show-caps: don't report flops for unknown cuda devs, report peer access
show-caps:
- don't report flops for unknown cuda devs (report 0.0, like cpus in #663)
- report peer access mask
- report CPU AVX/SIMD instruction sets when they are detected as available
The diff looks bigger than it actually is because I had to move the cuda show_caps to after all_devices_attached
to be able to report peer access, so it's mostly copy-pasting from cuda_module_init
to all_devices_attached.
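
To illustrate why the peer-access report needs every device to be attached first, here is a minimal sketch of how such a mask could be built. It is not the code from this PR: `peer_access_mask`, `can_access_fn`, and `paired_topology` are hypothetical names, and the topology is a stand-in. With real CUDA devices the callback could wrap `cudaDeviceCanAccessPeer(&flag, src, dst)`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: returns nonzero if device `src` can access `dst`.
 * With CUDA this could wrap cudaDeviceCanAccessPeer(&flag, src, dst). */
typedef int (*can_access_fn)(int src, int dst);

/* Build a bitmask for device `dev`: bit j is set when `dev` can reach peer j.
 * All `ndev` devices must already be known, which is why the report can only
 * run once every device is attached. */
static uint32_t peer_access_mask(int dev, int ndev, can_access_fn can_access)
{
    uint32_t mask = 0;
    assert(ndev >= 0 && ndev <= 32);   /* one bit per device in a 32-bit mask */
    for (int peer = 0; peer < ndev; peer++) {
        if (peer == dev) continue;     /* a device is not its own peer */
        if (can_access(dev, peer)) mask |= (uint32_t)1 << peer;
    }
    return mask;
}

/* Stand-in topology for illustration only: devices are paired (0,1), (2,3), ... */
static int paired_topology(int src, int dst)
{
    return (src / 2) == (dst / 2);
}
```

Under the stand-in topology with four devices, `peer_access_mask(0, 4, paired_topology)` sets only bit 1, since device 0 can reach only device 1.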
- [x] undo the change that reorders the caps; after using this PR for a while, the peer-access mask turned out not to be very relevant info