UDTF crashes the server if `return != len(input)`
The server crashes on the following example when the input column has more than 2 rows and the function returns 2:
@omnisci('int32(Column<int32>, OutputColumn<int32>)')
def example(input, out):
    size = len(input)
    for i in range(size):
        out[i] = input[i]
    return 2
The server likely crashes because no memory has been allocated for the output column `out`. Use:
@omnisci('int32(Column<int32>, OutputColumn<int32>)')
def example(input, out):
    size = len(input)
    set_output_row_size(size)
    for i in range(size):
        out[i] = input[i]
    return size
On the other hand, avoiding the server crash in the issue example may require analyzing the generated code and the corresponding signature. For instance, when the signature specifies no sizer arguments and the body makes no call to the `set_output_row_size` function, the resulting operator will likely crash the server. Another approach would be to implement a range check on indexing input and output columns, so that running the above example on the server would raise an index error but keep the server alive.
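To illustrate the range-check idea, here is a minimal pure-Python sketch. `CheckedColumn` is a hypothetical stand-in for a column buffer, not a class from the library; it only shows how bounds-checked indexing would turn the out-of-range write from the issue example into an `IndexError` instead of touching unallocated memory.

```python
class CheckedColumn:
    """Hypothetical stand-in for a column buffer with bounds-checked indexing."""

    def __init__(self, size):
        self._data = [0] * size

    def __len__(self):
        return len(self._data)

    def _check(self, i):
        # Reject any index outside the allocated range.
        if not 0 <= i < len(self._data):
            raise IndexError(
                f"index {i} out of range for column of size {len(self._data)}"
            )

    def __getitem__(self, i):
        self._check(i)
        return self._data[i]

    def __setitem__(self, i, value):
        self._check(i)
        self._data[i] = value


def example(input, out):
    # Same body as the issue example: writes len(input) rows, returns 2.
    size = len(input)
    for i in range(size):
        out[i] = input[i]
    return 2


inp = CheckedColumn(5)
out = CheckedColumn(0)  # models an output column with no rows allocated

try:
    example(inp, out)
    result = "ok"
except IndexError as exc:
    result = f"IndexError: {exc}"
```

With the check in place, the first write to `out` fails with an error that can be reported to the user. The cost is an extra comparison on every element access, which may matter in tight loops.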
But then how to only return a slice? I thought this return could be used for that?
For crashing the server, would there be a way to use some sort of sandbox or pre-validation? It would be good to check the function when it's being registered so that a user cannot crash the server.
But then how to only return a slice? I thought this return could be used for that?
There are a number of ways (perhaps too many) to specify the size of output columns, and each has its advantages and disadvantages. I'll give a summary elsewhere.
For crashing the server, would there be a way to use some sort of sandbox or pre-validation? It would be good to check the function when it's being registered so that a user cannot crash the server.
Sure, it would be desirable, but technically it is not trivial. For instance, pre-validation requires generating sample inputs to table functions, which means that if a table function defines restrictions on its arguments, the samples must obey these as well. And even then, one can likely construct a function that crashes the server on specific inputs while executing fine on the samples.
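A cheaper registration-time check that needs no sample inputs is the static one mentioned earlier in the thread: flag a function whose signature has no sizer argument and whose body never calls `set_output_row_size`. The sketch below is only a heuristic built on the standard `ast` module. The helper names are hypothetical, and whether the signature has a sizer is passed in by the caller as `has_sizer` rather than parsed from the signature string.

```python
import ast
import textwrap


def calls_set_output_row_size(source):
    """Return True if the source contains a call to set_output_row_size."""
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Handle both `set_output_row_size(...)` and `mod.set_output_row_size(...)`.
            name = getattr(func, "id", None) or getattr(func, "attr", None)
            if name == "set_output_row_size":
                return True
    return False


def may_write_unallocated_output(source, has_sizer):
    """Heuristic: no sizer argument and no explicit output allocation."""
    return not has_sizer and not calls_set_output_row_size(source)


BAD = '''
@omnisci('int32(Column<int32>, OutputColumn<int32>)')
def example(input, out):
    size = len(input)
    for i in range(size):
        out[i] = input[i]
    return 2
'''

GOOD = '''
@omnisci('int32(Column<int32>, OutputColumn<int32>)')
def example(input, out):
    size = len(input)
    set_output_row_size(size)
    for i in range(size):
        out[i] = input[i]
    return size
'''

bad_flagged = may_write_unallocated_output(BAD, has_sizer=False)
good_flagged = may_write_unallocated_output(GOOD, has_sizer=False)
```

The source is only parsed, never executed, so the undefined `omnisci` decorator is harmless here. The check cannot tell whether the call is actually reached or whether the allocated size is large enough, so it would reject the issue example but not every function that can crash the server.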