csv-express icon indicating copy to clipboard operation
csv-express copied to clipboard

Guarantee column order of objects. With headers.

Open zevero opened this issue 9 years ago • 8 comments

Order is never guaranteed in Objects, so I wonder that it worked for you up until now...

zevero avatar May 24 '16 06:05 zevero

It has always worked because v8 is very consistent with Object key ordering, with the exception of integers as keys - http://stackoverflow.com/questions/280713/elements-order-in-a-for-in-loop/280861#280861

I made some edits on a different branch that should guarantee a consistent order. Could you please take a look and let me know if it meets your needs? https://github.com/jczaplew/csv-express/tree/ordered

Thank you for your contribution!

jczaplew avatar May 24 '16 14:05 jczaplew

oh ... I did not mean Object key ordering, which is consistent in most browsers and v8 (albeit not in spec). And yes in my code I could have avoided to have both csvHeader_arr and csvHeader_obj if I had chosen to rely on it ;)

My pull request addresses another issue: My array of object comes from a mongodb and key order is dependent on the order they were saved. Some objects have missing keys, some have more keys.

So if you have a look in my code you can see, that I do not simply take the keys from the first object in the array, but I collect all keys in all of the objects in the order they come and give them a fixed position, which ensures that the same key (values) are in the same column (with correct key in header).

zevero avatar May 25 '16 08:05 zevero

+1

I see a critical issue when you have inputs like this:

Works: [{a: 1, b: 2}, {a: 3, b: 4}]

Column 'b' will be missing: [{a: 1}, {a: 3, b: 4}]

So instead of relying on the first object, we would have to create a full table index based on all inputs. What do you think?

amadeus-torwell avatar May 22 '18 10:05 amadeus-torwell

This is exactly why I have opened this pull request 2 years ago ... yes I am creating a full table index.

zevero avatar May 22 '18 10:05 zevero

@deusama and @zevero - I see the issue, but it seems that creating a full table index could result in a large performance penalty, both in terms of parsing speed and potentially memory usage.

One way this could be addressed is by modifying the package to accept a list of headers, i.e. declaring the object order a priori. It would look something like this:

let headers = ['a', 'b']
let data = [ { 'a': 1}, {'b': 2, 'a': 3} ]

let body = ''
// Loop through the data
for (let i = 0; i < data.length; i++) {
  // Iterate on the known headers
  for (let header = 0; header < headers.length; header++) {
    // For each data object, check if it contains the known key
    if (data[i][headers[header]] {
       body += data[i]
    }
  }
}

I'd still like to benchmark this, but it removes the necessity of looping through all of the data twice, once to find all headers and again to parse the data.

Thoughts?

jczaplew avatar May 22 '18 14:05 jczaplew

Thanks for the quick reply @jczaplew!

Could be done this way, or using a configuration flag having an option for 'first-row-is-head' sort of thing. So the header could be within the data: [{a, b}, {a: 1}, {a: 3, b: 4}]

And a second thought - yes I like the idea of having a configuration here. So we can specify the data we want to see in the csv outputs.

amadeus-torwell avatar May 22 '18 14:05 amadeus-torwell

Have a look at my Version of the code https://github.com/zevero/csv-express/blob/b60518d748f2d7df41fcc374c5f29bd308d8089a/lib/csv-express.js

There is no performance penality. As data is parsed there is just a check if the key is known. If not it is simply added.

Of course the header could be predefined. But I would rather not like to bother. For me export-csv should JUST export whatever data there is into csv format. For me, it was just important, that the values, don't end up under the wrong header.

zevero avatar May 22 '18 14:05 zevero

Sorry to ask, but do we have progress? :-)

Thanks, Amadeus

amadeus-torwell avatar Jun 22 '18 08:06 amadeus-torwell