ChoETL icon indicating copy to clipboard operation
ChoETL copied to clipboard

Nested JSON to CSV resulting in single row with multiple columns instead of multiple rows

Open rohit3d2003 opened this issue 4 years ago • 4 comments
trafficstars

I am trying to convert a dynamic nested JSON to CSV and it ends up creating CSV with single row with multiple headers. Refer below example

JSON: { "Latitude": "10.00", "Longitude": "11.00", "Floor": [ { "Floor ID": "FL01", "Toilet": [ { "Sink": "Test", "Handdryer": "Test", "Urinal": "10" }, { "Sink": "5", "Handdryer": "Test", "Urinal": "Test" } ] }, { "Floor ID": "FL02", "Floor Level": "Level 2", "Floor Name": "Floor2", "Cubicles": { "Desk Type": "20", "Foot Rest": "30", "Separator Type": "Test" } } ], "Elevator": [ { "Number": "Elevator1" }, { "Number": "Elevator2", "Led Light": { "Number": "123" } } ] }

Output CSV: Screen Shot 2021-04-25 at 11 44 07 AM

Expected Output: Screen Shot 2021-04-25 at 11 52 49 AM

Code Snippet: Screen Shot 2021-04-25 at 11 54 26 AM

rohit3d2003 avatar Apr 25 '21 18:04 rohit3d2003

Well, this is some complex requirement. Will need heavy lifting to achieve the expected output. Here is how you can do it with Cinchoo ETL

PS: Pls take the latest version from nuget, v1.2.1.15

string jsonFilePath = "sample141.json";
List<dynamic> objs = new List<dynamic>();
using (var r = new ChoJSONReader(jsonFilePath)
    .IgnoreField("Elevator")
    .UseJsonSerialization()
    )
{
    var r1 = r.FlattenBy("Floor", "Toilet");
    objs.AddRange(r1.ToArray());
}
using (var r = new ChoJSONReader(jsonFilePath)
    .IgnoreField("Elevator")
    .UseJsonSerialization()
    )
{
    var r1 = r.FlattenBy("Floor", "Cubicles");
    objs.AddRange(r1.ToArray());
}

using (var r = new ChoJSONReader(jsonFilePath)
    .WithJSONPath("$..Elevator")
    .UseJsonSerialization()
    )
{
    objs.AddRange(r.ToArray());
}

StringBuilder csv = new StringBuilder();
using (var w = new ChoCSVWriter(csv)
    .WithFirstLineHeader()
    .ThrowAndStopOnMissingField(false)
    .WithMaxScanRows(10)
    )
{

    w.Write(objs);
}

Console.WriteLine(csv.ToString());

Output:

Latitude,Longitude,Floor ID,Sink,Handdryer,Urinal,Floor Level,Floor Name,Desk Type,Foot Rest,Separator Type,Number,Led Light_Number
10.00,11.00,FL01,Test,Test,10,,,,,,,
10.00,11.00,FL01,5,Test,Test,,,,,,,
10.00,11.00,FL02,,,,Level 2,Floor2,20,30,Test,,
,,,,,,,,,,,Elevator1,
,,,,,,,,,,,Elevator2,123

Cinchoo avatar Apr 26 '21 00:04 Cinchoo

Thanks for suggesting this solution. Our challenge is that JSON is dynamic and we don’t know schema upfront. Even the nested arrays are not deterministic. I may have to come up with an approach of doing it generically. Any suggestions along those lines?

rohit3d2003 avatar Apr 26 '21 03:04 rohit3d2003

I doubt there is straight approach for your requirement.

Cinchoo avatar Apr 27 '21 22:04 Cinchoo

The requirement is any nested array in json should result in multiple rows of data instead of multiple columns. I have seen lot of posts where people are looking to structure csv in this way and I am gonna give it a shot. Will keep you posted if I make any progress

rohit3d2003 avatar Apr 28 '21 03:04 rohit3d2003