[onert] Improve exception handling by subclassing `std::runtime_error`
Problem
Currently in the runtime core all exceptions are the same (std::runtime_error) with different message. Because of that it is not possible (without comparing message string) to distinguish these errors in the API component. Right now the API returns NNFW_STATUS_ERROR almost everywhere which makes error handling hard (the user of the API can not act differently on different errors).
Proposition
Improve error handling by creating specific errors based on the std::runtime_error. Then the list of possible errors in the API can be extended and instead of returning NNFW_STATUS_ERROR almost everywhere, dedicated error codes could be returned.
If you are OK with such change I can try to implement that.
Right. We need more error type. And we need way to present additional information about error to user if possible.
I think we already has enough error status in nnfw.h:
typedef enum
{
/** Successful */
NNFW_STATUS_NO_ERROR = 0,
/**
* An error code for general use.
* Mostly used when there is no specific value for that certain situation.
*/
NNFW_STATUS_ERROR = 1,
/** Unexpected null argument is given. */
NNFW_STATUS_UNEXPECTED_NULL = 2,
/** When a function was called but it is not valid for the current session state. */
NNFW_STATUS_INVALID_STATE = 3,
/** When it is out of memory */
NNFW_STATUS_OUT_OF_MEMORY = 4,
/** When it was given an insufficient output buffer */
NNFW_STATUS_INSUFFICIENT_OUTPUT_SIZE = 5,
/** When API is deprecated */
NNFW_STATUS_DEPRECATED_API = 6,
} NNFW_STATUS;
and there are several places that returns the correct error status.
NNFW_STATUS nnfw_session::create(nnfw_session **session)
{
if (session == nullptr)
return NNFW_STATUS_UNEXPECTED_NULL;
try
{
auto new_session = std::unique_ptr<nnfw_session>(new nnfw_session());
new_session->_kernel_registry = std::make_shared<onert::api::CustomKernelRegistry>();
*session = new_session.release();
}
catch (const std::bad_alloc &e)
{
std::cerr << "Error during session creation" << std::endl;
*session = nullptr; // Set nullptr on error to keep the old behavior
return NNFW_STATUS_OUT_OF_MEMORY;
}
catch (const std::exception &e)
{
std::cerr << "Error during session initialization : " << e.what() << std::endl;
*session = nullptr; // Set nullptr on error to keep the old behavior
return NNFW_STATUS_ERROR;
}
return NNFW_STATUS_NO_ERROR;
}
Could you please show an example code?
Issues on error handling
NNFW_STATUS_OUT_OF_MEMORYis used for session creation only now- Most errors return
NNFW_STATUS_ERROR- Invalid model
- Unsupported kernel
- Memory allocation fail
- Plugins (backend, loader) loading fail
- NYI
- etc... (all cases throwing exception in runtime)
- Sometimes not throw exception on unexpected case
- Sometimes I find this case on backend implementation
- It's hard to report error to developer
- It's bad to show console output under API implementation
- Driver or app may print error on console by using additional error information, but printing error message by
std::outorstd::cerron runtime level looks strange - We may need different way to send error message for each OS - such as android or tizen log supporting
- Driver or app may print error on console by using additional error information, but printing error message by
My findings are exactly the same as what @hseok-oh has shown in the above comment.
I will raise a PR to fix one of these issues (reporting unsupported data type for operations) hopefully today or on Monday, so you will see whether such refactoring will be an improvement for the API or not.
Also, in addition to returned error code, I thing that the API would need a special function for returning last error message. This will allow to decouole error reporting with stderr, so the API user could show the error message to user on GUI for example.
FYI - old discussion about this issue
Exception handling issue: Improvement error handling Exception type draft: https://github.com/Samsung/ONE/pull/2813
I'd like to take a shot at this issue. I have experience with C++ modernization and API design from my work on HPX (implementing sender/receiver execution models and error propagation) and NVIDIA CCCL.
Looking at PR #16229, I see the pattern you're establishing. I'm thinking I could extend this approach to cover the other error cases that @hseok-oh mentioned particularly:
- Unsupported kernel operations
- Invalid model errors
- Plugin/backend loading failures
- Memory allocation failures (beyond session creation)
I'd follow the same pattern: create specific exception types inheriting from std::runtime_error, map them to new NNFW_STATUS codes, and update the error handling throughout the runtime.
Does this sound like a good direction? Happy to start with one specific category if you'd prefer an incremental approach.
Then the list of possible errors in the API can be extended and instead of returning NNFW_STATUS_ERROR almost everywhere, dedicated error codes could be returned.
I am saying about this. I am not sure that new code shoud be extended.
As @hseok-oh said, the problem is the missing implementation, not missing code.
In fact, I don't see it is a big problem. For most status code, there is nothing to do except for printing the error and exit. I think there are other priority things to do.
It would be better to see the suggested code.
Again, I am not against to improve exception handling. I just want not to add the unnecessary NNFW_STATUS code.