
Extra line feed in JSON file creates an extra row in Hive (and the count is incorrect)

Open bjaggi opened this issue 7 years ago • 1 comments

Hello, I am using your SerDe for nested JSON mapping and it works great.

We have a scenario where records are delimited by two line feeds (it seems Hive only supports a single \n as the delimiter, which is one more reason to go with a custom SerDe).

Sample input file:

{ "id": "1",

"id":"2"
}

When I run select * from hive_table or count(*), Hive counts the extra line feed as a row. The expected count is 2, but Hive shows 3.

I tried changing some code in this file:

Link_To_JSONObject.java_Line318

New logic: split the text on the \n delimiter, then drop any lines that are empty after trimming. This works fine in the test case, but not when I use it in Hive. Any suggestions?
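For reference, the split-and-filter logic described above could look roughly like the sketch below. The class and method names (`BlankLineFilter`, `nonEmptyLines`) are hypothetical and are not part of Hive-JSON-Serde, and the input assumes one complete JSON record per line with a blank line in between, which is not exactly the sample posted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hive-JSON-Serde.
public class BlankLineFilter {

    // Split on '\n' and keep only lines that are non-empty after trimming.
    public static List<String> nonEmptyLines(String text) {
        List<String> records = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                records.add(trimmed);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Assumed input: two records separated by an extra line feed.
        String input = "{\"id\":\"1\"}\n\n{\"id\":\"2\"}\n";
        List<String> records = nonEmptyLines(input);
        System.out.println(records.size()); // prints 2
    }
}
```

A possible reason this passes in a standalone test but not in Hive: with a text input format, Hive splits the file into lines before the SerDe is called, so the SerDe would receive the blank line as its own record and never see the whole text at once. If that is the case here, the filtering would have to happen before the SerDe, for example by cleaning the file or using a different input format.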

bjaggi, May 17 '18 15:05

Hmm, do you have the complete JSON you're using, as an actual file? The one you posted should not work at all, since the SerDe only supports one JSON record per line, with no \n inside a record.

rcongiu, Aug 11 '20 04:08