tablib
library re-orders columns when exporting content in YAML
Hello,
I've been using tablib indirectly through records to produce some Excel reports from a database, and I've noticed that although I add my content in a specific order, the exported columns do not come out in the order I added them.
Looking at the documentation of tablib -- if the examples are not merely illustrative -- this should be noticeable, as for instance, when exporting to JSON, YAML, and CSV, the output "columns" are re-ordered differently for each format:
>>> print data.json
[
{
"last_name": "Adams",
"age": 90,
"first_name": "John"
},
{
"last_name": "Ford",
"age": 83,
"first_name": "Henry"
}
]
>>> print data.yaml
- {age: 90, first_name: John, last_name: Adams}
- {age: 83, first_name: Henry, last_name: Ford}
>>> print data.csv
first_name,last_name,age
John,Adams,90
Henry,Ford,83
I'd appreciate it if this could be fixed so that the output always has the same structure as the input.
Thanks
I'm also facing this issue. I'm using a list to explicitly set headers, then I import a list of dicts. Column ordering is broken after I export to xlsx, and the header names are also lost.
This should not be the case!
I guess that when I import dicts, the headers get overridden and thus randomized. As a workaround I changed the dicts to flat tuples. This way there are no keys in the data and it's perfectly ordered. I can provide a minimal example for both.
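A minimal sketch of that workaround in plain Python, assuming the rows start out as dicts and the desired column order is held in a separate list. The names `headers`, `records`, and `rows` are illustrative and not part of tablib; the resulting tuples are what would then be appended to the dataset instead of the dicts.

```python
# Desired column order, set explicitly (illustrative data).
headers = ["first_name", "last_name", "age"]

# Source rows as dicts; key order here is deliberately inconsistent.
records = [
    {"age": 90, "last_name": "Adams", "first_name": "John"},
    {"last_name": "Ford", "first_name": "Henry", "age": 83},
]

# Flatten each dict into a tuple that follows the header order,
# so the column order no longer depends on the dict keys.
rows = [tuple(record[name] for name in headers) for record in records]

for row in rows:
    print(row)
```

Because each tuple is built by walking `headers`, every row has the same column order regardless of how the original dicts were constructed.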
great!
I couldn't reproduce using these examples from the README:
Python 3.8.1 (v3.8.1:1b293b6006, Dec 18 2019, 14:08:53)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tablib
>>> tablib.__version__
'1.0.0'
>>> data = tablib.Dataset()
>>> names = ['Kenneth Reitz', 'Bessie Monke']
>>>
>>> for name in names:
... fname, lname = name.split()
... data.append([fname, lname])
...
>>> data.dict
[['Kenneth', 'Reitz'], ['Bessie', 'Monke']]
>>> data.headers = ['First Name', 'Last Name']
>>> data.dict
[OrderedDict([('First Name', 'Kenneth'), ('Last Name', 'Reitz')]), OrderedDict([('First Name', 'Bessie'), ('Last Name', 'Monke')])]
>>> data.append_col([22, 20], header='Age')
>>> data.dict
[OrderedDict([('First Name', 'Kenneth'), ('Last Name', 'Reitz'), ('Age', 22)]), OrderedDict([('First Name', 'Bessie'), ('Last Name', 'Monke'), ('Age', 20)])]
>>> data.export('csv')
'First Name,Last Name,Age\r\nKenneth,Reitz,22\r\nBessie,Monke,20\r\n'
>>> data.export('json')
'[{"First Name": "Kenneth", "Last Name": "Reitz", "Age": 22}, {"First Name": "Bessie", "Last Name": "Monke", "Age": 20}]'
>>> data.export('yaml')
'- {Age: 22, First Name: Kenneth, Last Name: Reitz}\n- {Age: 20, First Name: Bessie, Last Name: Monke}\n'
It might have been a Python 2-only problem.
If it's still a problem with Python 3, please include a reproducible snippet of code, along with the Python and tablib versions.
Please have a look at the image below with a snippet from the example on the docs.
Mind how the order for json and yaml is different.
Also note that this is while using Python 3.

Thanks, now I see, the age column is in a different place with YAML (they're in alphabetical order).
This is specific to YAML. Traditionally, PyYAML always sorted keys alphabetically. Only recently (https://github.com/yaml/pyyaml/pull/254, committed March 2019) did PyYAML offer the ability to opt out of this key sorting. So now we should be able to add the sort_keys=False parameter to our usage of yaml.safe_dump, on the condition that we also require PyYAML >= 5.1.
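To illustrate the difference, here is a small sketch that calls PyYAML directly rather than going through tablib. It assumes PyYAML >= 5.1 is installed; the sample rows are taken from the docs example quoted above, and the variable names are illustrative.

```python
import yaml  # requires PyYAML >= 5.1 for the sort_keys parameter

# Rows with keys inserted in the intended column order.
rows = [
    {"first_name": "John", "last_name": "Adams", "age": 90},
    {"first_name": "Henry", "last_name": "Ford", "age": 83},
]

# Default behaviour: keys are sorted alphabetically, so "age" comes first.
sorted_yaml = yaml.safe_dump(rows)

# Opting out of sorting: keys keep their insertion order.
ordered_yaml = yaml.safe_dump(rows, sort_keys=False)

print(sorted_yaml)
print(ordered_yaml)
```

With `sort_keys=False` the dump follows the dict's insertion order, which matches the column order seen in the CSV and JSON exports.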