
library re-orders columns when exporting content in YAML

Open PauloPhagula opened this issue 8 years ago • 8 comments

Hello,

I've been using tablib indirectly through records to produce some Excel reports from a database, and I've noticed that although I add my content in a specific order, the exported columns do not come out in the order in which I added them.

The tablib documentation shows the same behaviour, assuming its examples are not merely illustrative: when exporting to JSON, YAML, and CSV, the output "columns" are ordered differently for each format:

>>> print data.json
[
  {
    "last_name": "Adams",
    "age": 90,
    "first_name": "John"
  },
  {
    "last_name": "Ford",
    "age": 83,
    "first_name": "Henry"
  }
]

>>> print data.yaml
- {age: 90, first_name: John, last_name: Adams}
- {age: 83, first_name: Henry, last_name: Ford}

>>> print data.csv
first_name,last_name,age
John,Adams,90
Henry,Ford,83

I'd appreciate it if this could be fixed so that the output always has the same column order as the input.

Thanks

PauloPhagula avatar Apr 22 '17 17:04 PauloPhagula

I'm also facing this issue. I use a list to set the headers explicitly, then import a list of dicts. After exporting to xlsx, the column ordering is broken and the header names are lost as well.

Cediddi avatar May 31 '17 12:05 Cediddi

this should not be the case!

kennethreitz avatar Jun 04 '17 15:06 kennethreitz

I guess that when I import dicts, the headers get overridden and thus randomized. As a workaround, I changed the dicts to flat tuples. This way there are no keys in the data and it's perfectly ordered. I can provide a minimal example for both.

Cediddi avatar Jun 04 '17 16:06 Cediddi

great!

kennethreitz avatar Jun 05 '17 16:06 kennethreitz

I couldn't reproduce using these examples from the README:

Python 3.8.1 (v3.8.1:1b293b6006, Dec 18 2019, 14:08:53)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tablib
>>> tablib.__version__
'1.0.0'
>>> data = tablib.Dataset()
>>> names = ['Kenneth Reitz', 'Bessie Monke']
>>>
>>> for name in names:
...     fname, lname = name.split()
...     data.append([fname, lname])
...
>>> data.dict
[['Kenneth', 'Reitz'], ['Bessie', 'Monke']]
>>> data.headers = ['First Name', 'Last Name']
>>> data.dict
[OrderedDict([('First Name', 'Kenneth'), ('Last Name', 'Reitz')]), OrderedDict([('First Name', 'Bessie'), ('Last Name', 'Monke')])]
>>> data.append_col([22, 20], header='Age')
>>> data.dict
[OrderedDict([('First Name', 'Kenneth'), ('Last Name', 'Reitz'), ('Age', 22)]), OrderedDict([('First Name', 'Bessie'), ('Last Name', 'Monke'), ('Age', 20)])]
>>> data.export('csv')
'First Name,Last Name,Age\r\nKenneth,Reitz,22\r\nBessie,Monke,20\r\n'
>>> data.export('json')
'[{"First Name": "Kenneth", "Last Name": "Reitz", "Age": 22}, {"First Name": "Bessie", "Last Name": "Monke", "Age": 20}]'
>>> data.export('yaml')
'- {Age: 22, First Name: Kenneth, Last Name: Reitz}\n- {Age: 20, First Name: Bessie, Last Name: Monke}\n'

It might have been a Python 2-only problem.

If it's still a problem with Python 3, please include a reproducible snippet of code, along with the Python and tablib versions.

hugovk avatar Feb 12 '20 13:02 hugovk

Please have a look at the image below with a snippet from the example on the docs.

Mind how the order for json and yaml is different.

Also note that this is while using Python 3.

Screenshot from 2020-02-12 17-34-01

PauloPhagula avatar Feb 12 '20 15:02 PauloPhagula

Thanks, now I see, the age column is in a different place with YAML (they're in alphabetical order).

hugovk avatar Feb 12 '20 16:02 hugovk

This is specific to YAML. Traditionally, PyYAML always sorted keys alphabetically. Only recently (https://github.com/yaml/pyyaml/pull/254, committed March 2019) did PyYAML offer a way to opt out of this key sorting. So we should now be able to pass the sort_keys=False parameter to our call to yaml.safe_dump, on the condition that we also require PyYAML >= 5.1.
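
The difference can be sketched with PyYAML alone, outside tablib. This is only an illustration, assuming PyYAML >= 5.1 is installed; the `rows` data is made up to match the documentation example, and this is not the actual tablib patch.

```python
import yaml  # PyYAML; the sort_keys parameter requires version 5.1 or later

# Illustrative rows with keys in insertion order: first_name, last_name, age.
rows = [
    {"first_name": "John", "last_name": "Adams", "age": 90},
    {"first_name": "Henry", "last_name": "Ford", "age": 83},
]

# Default behaviour: mapping keys are sorted alphabetically.
sorted_out = yaml.safe_dump(rows)

# Opting out of key sorting keeps the insertion order of each dict.
ordered_out = yaml.safe_dump(rows, sort_keys=False)

print(sorted_out)
print(ordered_out)
```

In the first dump `age` comes before `first_name`; in the second the keys appear in the order they were inserted.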

claudep avatar Feb 12 '20 18:02 claudep