[Bug] Object key order matters when exporting to CSV - ideally it shouldn't?

Open nktnet1 opened this issue 1 year ago • 0 comments

Which package is this bug report for? If unsure which one to select, leave blank

@crawlee/core

Issue description

  1. Create a dataset
  2. Push an object, e.g.
    { name: 'John', age: 30 }
    
  3. Push another object with the same keys, but different order, e.g.
    { age: 42, name: 'Adam' }
    
  4. Export the dataset to a CSV file
  5. Observe that the output is
    name,age
    John,30
    42,Adam
    

I would have expected the column order to be determined by the keys of the first object in the dataset, with the values of every later object matched to columns by key, e.g.

name,age
John,30
Adam,42

Otherwise, the fact that key order matters should be noted somewhere in the docs.
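To make the expected behaviour concrete, here is a minimal sketch of key-based CSV serialization. `toCsvByKey` is a hypothetical helper written for illustration, not crawlee's implementation or API. It assumes the header comes from the first object's keys and that keys appearing only in later objects are ignored.

```typescript
type Row = Record<string, unknown>;

// Hypothetical helper (not part of crawlee): the header is taken from the
// first object's keys, and every row is built by looking values up by key
// rather than by position.
function toCsvByKey(items: Row[]): string {
  if (items.length === 0) return '';
  const columns = Object.keys(items[0]);

  // Quote a cell only when it contains a comma, quote, or line break.
  const escapeCell = (value: unknown): string => {
    const text = value === undefined || value === null ? '' : String(value);
    return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };

  const lines = [columns.map(escapeCell).join(',')];
  for (const item of items) {
    lines.push(columns.map((column) => escapeCell(item[column])).join(','));
  }
  return lines.join('\n');
}

const csv = toCsvByKey([
  { name: 'John', age: 30 },
  { age: 42, name: 'Adam' },
]);
console.log(csv);
```

With the two objects from the steps above, this prints the expected `name,age` / `John,30` / `Adam,42` layout regardless of the key order of the second object.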

Code sample

import { Dataset } from 'crawlee';

const main = async () => {
  const datasetName = 'example-dataset';

  let dataset = await Dataset.open(datasetName);
  await dataset.drop();

  dataset = await Dataset.open(datasetName);

  await dataset.pushData({ name: 'John', age: 30 });
  await dataset.pushData({ age: 42, name: 'Adam' });

  await dataset.exportToCSV(`${datasetName}.csv`);
  console.log((await dataset.getData()).items);
};

void main();
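A possible workaround until this is addressed, sketched under the assumption (based only on the output observed above) that the export writes each item's values in that item's own key order: rebuild every object with a fixed key order before pushing it. `withKeyOrder` and `columnOrder` are hypothetical names introduced for this example.

```typescript
// Hypothetical helper (not part of crawlee): returns a copy of `item` whose
// keys are inserted in the order given by `columns`. JavaScript objects keep
// insertion order for non-integer-like string keys, which is assumed here.
function withKeyOrder<T extends Record<string, unknown>>(
  item: T,
  columns: readonly (keyof T & string)[],
): T {
  const ordered: Record<string, unknown> = {};
  for (const column of columns) {
    ordered[column] = item[column];
  }
  return ordered as T;
}

const columnOrder = ['name', 'age'] as const;
const normalized = withKeyOrder({ age: 42, name: 'Adam' }, columnOrder);
console.log(Object.keys(normalized));
```

In the code sample above, this would mean calling `dataset.pushData(withKeyOrder(item, columnOrder))` for each item so that every stored object shares the same key order.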

Package version

3.11.5

Node.js version

v20.17.0

Operating system

MacOS, Linux (Arch)

Apify platform

  • [ ] Tick me if you encountered this issue on the Apify platform

I have tested this on the next release

No response

Other context


nktnet1, Oct 18 '24 05:10