impala-get-json-object-udf
Nested JSON breaks connection to Impala
Impala version 2.8
The UDF breaks the connection when it tries to handle nested arrays.
Example JSON:
{"customer_info":[{"field_name":"family_names","field_value":"Gonzalez"},{"field_name":"given_names","field_value":"Pablo"}],"phone":null}
This works:
select json_get_object('{"customer_info":[{"field_name":"family_names","field_value":"Gonzalez"},{"field_name":"given_names","field_value":"Pablo"}],"phone":null}','$.customer_info') ;
But this breaks Impala:
select json_get_object('{"customer_info":[{"field_name":"family_names","field_value":"Gonzalez"},{"field_name":"given_names","field_value":"Pablo"}],"phone":null}','$.customer_info.field_name') ;
@scratch28
Replace this line
https://github.com/nazgul33/impala-get-json-object-udf/blob/49e151f7cff9686a1197ca9283bcd847e4470812/jsonUdf.cc#L71
with
if (va.IsObject() && va.HasMember(key)) { \
and recompile.
IsObject() needs to be checked before calling HasMember().
See rapidjson/document.h:
#if RAPIDJSON_HAS_STDSTRING
//! Check whether a member exists in the object with string object.
/*!
\param name Member name to be searched.
\pre IsObject() == true
\return Whether a member with that name exists.
\note It is better to use FindMember() directly if you need the obtain the value as well.
\note Linear time complexity.
*/
bool HasMember(const std::basic_string<Ch>& name) const { return FindMember(name) != MemberEnd(); }
#endif
I think "\pre IsObject() == true" means that IsObject() returning true is a precondition of HasMember().
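To illustrate why the guard matters, here is a minimal sketch that uses only the standard library. The Value struct, GetChild, and MakeSample are hypothetical stand-ins written for this example; they are not rapidjson's classes and not the UDF's actual code. The stand-in HasMember() asserts the same precondition that document.h documents, and GetChild() checks IsObject() first so the assertion is never reached for an array such as customer_info.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a JSON value; this is NOT rapidjson's Value class.
struct Value {
    enum class Kind { Null, String, Array, Object };
    Kind kind = Kind::Null;
    std::vector<std::string> keys;  // member names (Object only)
    std::vector<Value> children;    // member values (Object) or elements (Array)

    bool IsObject() const { return kind == Kind::Object; }

    // Modeled on the documented contract: \pre IsObject() == true.
    bool HasMember(const std::string& name) const {
        assert(IsObject() && "HasMember() requires an object");
        for (const std::string& k : keys) {
            if (k == name) return true;
        }
        return false;
    }
};

// Resolves one path step. The IsObject() check short-circuits, so HasMember()
// is never called on an array, a string, or null.
const Value* GetChild(const Value& v, const std::string& key) {
    if (v.IsObject() && v.HasMember(key)) {
        for (std::size_t i = 0; i < v.keys.size(); ++i) {
            if (v.keys[i] == key) return &v.children[i];
        }
    }
    return nullptr;  // not an object, or no such member
}

// Builds a reduced version of the example from this issue:
// {"customer_info": [ {"field_name": <string>} ], "phone": null}
Value MakeSample() {
    Value name;
    name.kind = Value::Kind::String;

    Value entry;
    entry.kind = Value::Kind::Object;
    entry.keys.push_back("field_name");
    entry.children.push_back(name);

    Value list;
    list.kind = Value::Kind::Array;
    list.children.push_back(entry);

    Value root;
    root.kind = Value::Kind::Object;
    root.keys.push_back("customer_info");
    root.children.push_back(list);
    root.keys.push_back("phone");
    root.children.push_back(Value{});
    return root;
}
```

With this sample, GetChild(root, "customer_info") returns the array, and a second step GetChild(*array, "field_name") returns nullptr instead of violating the precondition, so the caller can treat the path as having no match.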