h2gis
h2gis copied to clipboard
ST_Explode performance issue
It seems that the ST_Explode function is badly implemented.
In debug mode the test https://github.com/orbisgis/h2gis/blob/master/h2gis-functions/src/test/java/org/h2gis/functions/spatial/SpatialFunctionTest.java#L126 shows that the function is visited 3 times instead of 1.
Don't know if we have the same bug on H2GIS 1.5.
@nicolas-f @SPalominos @gpetit @j3r3m1 @ELSW56
Sounds like a quite huge amount of time saved if solved !! =)
Yes it is visited multiple times because it needs to get the fields name. The optimisation has already done as it will effectively iterate over the source table only once
http://www.h2database.com/html/features.html#user_defined_functions
A function that returns a result set can be used like a table.
However, in this case the function is called at least twice:
first while parsing the statement to collect the column names (with parameters set to null where not known at compile time).
And then, while executing the statement to get the data (maybe multiple times if this is a join).
Thanks @nicolas-f The iteration is done each time H2 calls the function. So in this case (ST_Explode), the parseRow() method is calling 3 times. Don't know if it's an H2 limitation but I'd like to call the parseRow() only once by SQL query.
@katzyn We use extensively the table function to process large geometry table. ST_Explode to explode collection of geometries to single one or ST_MakeGrid to build a square grid on a geometry bounding box are both examples. I'm looking for a way or workaround to visit only once the table functions. I understand that the function is called at least twice but I'd like to know if it's possible to use the same behaviour as aggregate function that offers a init and getResult methods.
External table value functions in H2 always had various defective by design implementations and fixes for some issues broke other things. They need a brand new API designed in a some reasonable way.