gis-tools-for-hadoop

Error with ST_LineString when running query below

Open mpharding opened this issue 7 years ago • 3 comments

In the Hadoop YARN log for a container I am seeing these errors:

2016-07-12 20:10:55,516 [ERROR] [TezChild] |hive.ST_LineString|: Internal error - ST_LineString: java.lang.NullPointerException.
2016-07-12 20:10:55,517 [ERROR] [TezChild] |hive.ST_SetSRID|: Invalid arguments - one or more arguments are null.
2016-07-12 20:10:55,517 [ERROR] [TezChild] |hive.ST_GeodesicLengthWGS84|: Invalid arguments - one or more arguments are null.

The query I'm running is:

select PreQuery.name,
       sum(case when PreQuery.Geode < 10.0 then 1 else 0 end) 10mCount,
       sum(case when PreQuery.Geode < 50.0 then 1 else 0 end) 50mCount,
       sum(case when PreQuery.Geode < 1000.0 then 1 else 0 end) 1000mCount
from (
    select a.name,
           ST_GeodesicLengthWGS84(
               ST_SetSRID(
                   ST_LineString(a.lat, a.lon, b.lat, b.lon), 4326)) as Geode
    from a, b
) PreQuery
GROUP BY PreQuery.name
ORDER by 1000mCount desc

When I run this on a few thousand records it works fine, but when I run it on over 54k records I see these errors.

Any ideas why?

mpharding, Jul 12 '16 19:07

It looks like ST_LineString is returning a null and ST_GeodesicLengthWGS84 is logging the error because the geometry is null. My guess is that one or more of your records in the larger dataset has invalid/null values for lat and lon, which is causing ST_LineString to return null.

climbage, Jul 13 '16 15:07

Hmm, the log entry above makes it look like an ST_Geometry function is throwing an NPE when it should instead log an invalid (null) argument error.

randallwhitman, Jul 20 '16 16:07

@hardboy111 Were you able to double check your data to see if any of your records in the larger dataset have invalid or null values for your lat and lon?

GISDev01, Oct 10 '16 01:10
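
The data check suggested in the comments could be sketched roughly as follows. This is only a minimal sketch: it assumes the table names `a` and `b` and the columns `lat` and `lon` are exactly as they appear in the query above, and it only tests for NULLs, which is the suspected cause in this thread rather than a confirmed one.

```sql
-- Count rows whose coordinates are NULL in each input table.
-- Table and column names are taken from the query in the issue.
SELECT COUNT(*) AS null_coord_rows_a
FROM a
WHERE a.lat IS NULL OR a.lon IS NULL;

SELECT COUNT(*) AS null_coord_rows_b
FROM b
WHERE b.lat IS NULL OR b.lon IS NULL;
```

If either count is non-zero, adding the opposite condition (`a.lat IS NOT NULL AND a.lon IS NOT NULL`, and the same for `b`) to the inner select should keep those rows from reaching ST_LineString. Whether that also makes the NullPointerException go away is not established in this thread.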