arroyo
Spec needed: Error handling strategy in Arroyo SQL Functions
While trying to implement https://github.com/ArroyoSystems/arroyo/issues/25, I ran into two types of errors:
- errors coming from an invalid regexp, which can be handled when the SQL is parsed / the AST is generated
- errors coming from the underlying implementation; in Arrow's case, the regexp kernel can return an error: https://github.com/apache/arrow-rs/blob/master/arrow-string/src/regexp.rs#L42
It is probably true that the Arrow implementation returns an error only because it receives a String rather than a compiled regexp and tries to build the regexp internally. Even so, I think unwrapping the result on the assumption that it will never panic is not ideal. That would rely on internal implementation details rather than on the signature and the types, so I think we should come up with a strategy for mapping lower-level errors into Arroyo-specific errors.
DataFusion also returns an algebraic data type from its API, so I think we need to clarify what the approach should be for Arroyo: https://github.com/apache/arrow-datafusion/blob/8a112484ac7ae89afc7006d56c65fba2dab106ce/datafusion/physical-expr/src/regex_expressions.rs#L54
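
To make the discussion more concrete, here is a rough sketch of what mapping a lower-level error into an Arroyo-specific one could look like. Everything in it is an assumption for illustration: `SqlFunctionError`, `KernelError`, `kernel_regexp_match` and `regexp_match` are hypothetical names, not existing Arroyo or Arrow APIs, and the "kernel" is a toy stand-in (substring search plus a parenthesis check) rather than a real regexp engine.

```rust
use std::fmt;

/// Hypothetical stand-in for an error returned by a lower-level kernel
/// (for example Arrow's regexp kernel). This is NOT a real Arrow type.
#[derive(Debug)]
pub enum KernelError {
    Compute(String),
}

/// Hypothetical Arroyo-level error type; the name and variants are assumptions.
#[derive(Debug, PartialEq)]
pub enum SqlFunctionError {
    /// Detected while planning, e.g. an invalid regexp literal in the SQL.
    InvalidArgument { function: &'static str, reason: String },
    /// Surfaced by the underlying implementation at execution time.
    Execution { function: &'static str, reason: String },
}

impl fmt::Display for SqlFunctionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SqlFunctionError::InvalidArgument { function, reason } => {
                write!(f, "{}: invalid argument: {}", function, reason)
            }
            SqlFunctionError::Execution { function, reason } => {
                write!(f, "{}: execution failed: {}", function, reason)
            }
        }
    }
}

impl std::error::Error for SqlFunctionError {}

/// Toy stand-in for a fallible kernel call. It rejects patterns with
/// unbalanced parentheses and otherwise does a plain substring search.
fn kernel_regexp_match(value: &str, pattern: &str) -> Result<bool, KernelError> {
    if pattern.matches('(').count() != pattern.matches(')').count() {
        return Err(KernelError::Compute(format!(
            "unbalanced parentheses in pattern {:?}",
            pattern
        )));
    }
    Ok(value.contains(pattern))
}

/// The SQL function wrapper: instead of unwrapping, the kernel error is
/// mapped into the Arroyo-level error and returned to the caller.
pub fn regexp_match(value: &str, pattern: &str) -> Result<bool, SqlFunctionError> {
    kernel_regexp_match(value, pattern).map_err(|err| match err {
        KernelError::Compute(reason) => SqlFunctionError::Execution {
            function: "regexp_match",
            reason,
        },
    })
}
```

With something like this, `regexp_match("arroyo", "roy")` returns `Ok(true)`, while `regexp_match("arroyo", "(roy")` returns an `Err(SqlFunctionError::Execution { .. })` instead of panicking. The open question is whether errors like this should be propagated per record, per batch, or turned into NULLs.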