cuda-python
[FEA]: Relevant exceptions for cuCheckpointProcessGetState
Is this a duplicate?
- [x] I confirmed there appear to be no duplicate issues for this request and that I agree to the Code of Conduct
Area
cuda.bindings
Is your feature request related to a problem? Please describe.
This is a very small issue, but I am not sure whether it extends to other functions that I have not tested. For cuCheckpointProcessGetState, passing a PID that does not exist, or one that is not valid for checkpointing, results in the following error:
>>> cu.cuCheckpointProcessGetState(123434)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cuda/bindings/driver.pyx", line 44467, in cuda.bindings.driver.cuCheckpointProcessGetState
  File "/usr/lib64/python3.11/enum.py", line 714, in __call__
    return cls.__new__(cls, value)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/enum.py", line 1137, in __new__
    raise ve_exc
ValueError: 32718 is not a valid CUprocessState
I have also seen values other than 32718 reported as the invalid CUprocessState, seemingly at random.
Describe the solution you'd like
Consistent exceptions for common failures, such as a PID not existing or being invalid. The cuda-checkpoint CLI gives the following message, which would be fine:
Error getting process state for process ID 1234234344: "OS call failed or operation not supported on this OS"
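To make the request concrete, here is a rough sketch of the behaviour I have in mind: inspect the status code first, and only convert the state value to an enum on success. Everything in it is a stand-in for illustration. `StandInResult`, `StandInProcessState`, `CheckpointStateError` and `convert_result` are hypothetical names, and the numeric values are placeholders, not taken from the CUDA headers or from the actual binding code.

```python
from enum import IntEnum


class StandInResult(IntEnum):
    # Placeholder status codes, assumed for this sketch only.
    SUCCESS = 0
    OPERATING_SYSTEM_ERROR = 304


class StandInProcessState(IntEnum):
    # Placeholder process states, assumed for this sketch only.
    RUNNING = 0
    LOCKED = 1
    CHECKPOINTED = 2
    FAILED = 3


class CheckpointStateError(RuntimeError):
    """Hypothetical exception raised whenever the state query fails."""

    def __init__(self, pid, status):
        super().__init__(
            f"Error getting process state for process ID {pid}: {status.name}"
        )
        self.pid = pid
        self.status = status


def convert_result(pid, raw_status, raw_state):
    """Convert raw integers into enums, checking the status first."""
    status = StandInResult(raw_status)
    if status is not StandInResult.SUCCESS:
        # On failure the state value is never interpreted, so an
        # arbitrary integer such as 32718 cannot trigger a ValueError.
        raise CheckpointStateError(pid, status)
    return StandInProcessState(raw_state)
```

With this ordering, a failed query for PID 123434 would raise `CheckpointStateError` carrying the status, whatever value the state output happens to hold.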
Describe alternatives you've considered
No response
Additional context
Inconsistent or irrelevant exceptions make unit testing around this area of the CUDA driver difficult and messy.
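For reference, this is roughly the workaround I would otherwise need in tests: a thin wrapper that turns the stray ValueError into one predictable exception type. `ProcessStateUnavailable` and `get_state_or_raise` are hypothetical names of my own, and the wrapper takes the query function as an argument so the sketch makes no assumption about what the binding returns on success.

```python
class ProcessStateUnavailable(RuntimeError):
    """Hypothetical exception meaning 'no usable state for this PID'."""


def get_state_or_raise(get_state, pid):
    """Call get_state(pid), mapping the enum ValueError to one exception.

    get_state is any callable taking a PID, for example
    cu.cuCheckpointProcessGetState from the report above.
    """
    try:
        result = get_state(pid)
    except ValueError as exc:
        # The binding failed to map the returned integer to CUprocessState.
        raise ProcessStateUnavailable(
            f"Could not get process state for process ID {pid}: {exc}"
        ) from exc
    # Whatever the call returned is passed through unchanged.
    return result
```

A test can then assert on `ProcessStateUnavailable` instead of matching the text of a ValueError that changes from run to run. It still cannot distinguish a missing PID from other failures, which is why a proper exception from the binding would be preferable.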