cuda.core.system: Better checks for when we expect APIs to be unsupported
Prior to this PR, the unit tests in cuda.core.system would always skip a test whenever an API raised a NotSupportedError. This meant that a test could be skipped even when the documentation says the API should work. This PR makes such a skip acceptable only when the API is documented as not supported for the device's architecture.
There is some nuance, though. Some of these APIs still fail as not supported even when the documentation says they should work. This may be because some other aspect of the device (not just the architecture) makes them unsupported. Unfortunately, the NVML headers don't offer any clues there, but at least we now know where all of these cases are (they are noted with `unsupported_before(device, None)`), and we can follow up by filing documentation bugs against NVML and refining the skip criteria even further.
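To make the intent concrete, here is a rough sketch of how such a helper could look. This is not the PR's actual implementation: the exception class, the `compute_capability` attribute, and the skip message are all assumptions.

```python
import contextlib

import pytest


class NotSupportedError(Exception):
    """Stand-in for the real exception raised by cuda.core.system."""


@contextlib.contextmanager
def unsupported_before(device, min_arch):
    # min_arch: the first compute capability the NVML docs list as
    # supported, e.g. (7, 0). None marks the unexplained cases noted
    # above, where the docs claim support but the API still fails.
    try:
        yield
    except NotSupportedError:
        if min_arch is None or device.compute_capability < min_arch:
            # Documented (or noted-for-follow-up) lack of support: skip.
            pytest.skip(f"not supported on this device (arch < {min_arch})")
        raise  # the docs say this architecture works, so fail loudly
```

A test would then wrap the call under test, something like `with unsupported_before(device, (7, 0)): query_the_api(device)` (names hypothetical).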
This also removes the behavior where some tests collected a different skip reason for each device and reported all of them. That added a lot of complexity for little benefit -- it is probably very rare for a system to have two devices of different architectures, especially on the systems we use for testing -- so I took all of that out to simplify things.
/ok to test
| Doc Preview CI |
| :---: |
| :rocket: View preview at https://nvidia.github.io/cuda-python/pr-preview/pr-1510/ |
| https://nvidia.github.io/cuda-python/pr-preview/pr-1510/cuda-core/ |
| https://nvidia.github.io/cuda-python/pr-preview/pr-1510/cuda-bindings/ |
| https://nvidia.github.io/cuda-python/pr-preview/pr-1510/cuda-pathfinder/ |
| Preview will be ready when the GitHub Pages deployment is complete. |
/ok to test
Just logging a concern: while this sounds reasonable today, it could be very surprising in the future, because a skip for one device may mask issues with another and silently reduce test coverage in a way that's easy to miss. Skips generally look harmless, but these skips may not be. Ideally, the potentially failure-masking skips would come with a warning -- in this case, only when there are more devices left to loop over, which isn't so easy to get right, I guess.
Thinking about this more, I think we can write a test for this case, so at least it might offer a clue to our future selves. I'll add it here.
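For the record, one possible shape of that warning, as a sketch rather than actual test code; the helper name, `reason`, and `remaining_devices` are all hypothetical:

```python
import warnings

import pytest


def skip_with_coverage_warning(reason, remaining_devices):
    # Hypothetical helper: pytest.skip() aborts the whole test, so if
    # more devices were still queued up, flag the coverage about to be
    # lost instead of skipping silently.
    if remaining_devices:
        warnings.warn(
            f"skip masks checks for {len(remaining_devices)} more "
            f"device(s): {reason}"
        )
    pytest.skip(reason)
```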
/ok to test
| Doc Preview CI |
| :---: |
| Preview removed because the pull request was closed or merged. |