Composite key
This looks like it will be really useful, thanks.
When we were working on CSV Schema Language we found it necessary to allow uniqueness to be defined over a composite set of columns (the unique column rule in the schema). I can see from the code structure that this wouldn't necessarily be entirely straightforward here, but I think it would be useful.
This doesn't seem impossibly difficult to add. It could work by allowing users to specify the --key option multiple times:
$ csv-diff one.csv two.csv --key=id --key=secondary
Part of the work would be teaching the CSV loading function to work with compound keys and create the internal ID as a tuple of values:
https://github.com/simonw/csv-diff/blob/825a28ccfdc20d011373b57b264970113df64872/csv_diff/__init__.py#L10-L15
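As a rough sketch of that loading step (not the code at the link above): the function name `load_csv_compound`, its `keys` parameter, and the sample data are all hypothetical, and the real loader may handle dialects and hashing differently. The idea is only that the internal ID becomes a tuple with one value per key column.

```python
import csv
import io


def load_csv_compound(fp, keys):
    """Hypothetical loader: index rows by a tuple of the given key columns."""
    rows = {}
    for row in csv.DictReader(fp):
        # The internal ID is a tuple with one value per key column.
        row_id = tuple(row[k] for k in keys)
        if row_id in rows:
            raise ValueError(f"Duplicate compound key: {row_id!r}")
        rows[row_id] = row
    return rows


# Illustrative data: "id" alone is not unique, but (id, secondary) is.
sample = io.StringIO("id,secondary,name\n1,a,Cleo\n1,b,Pancakes\n")
rows = load_csv_compound(sample, ["id", "secondary"])
```

With a single key this degrades to a one-element tuple, so the same lookup code could serve both cases.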
The human_text() function would then need to learn how to display a compound ID.
Thanks Simon,
I must admit I'd forgotten that our CSVs typically also have a unique URI per row, so we could use that as the key for the diff. Composite keys may still be useful for others, though.
For human_text(), perhaps some way of passing in a formatting string? In Python terms we'd want something like f"{r['lettercode']} {r['series']}/{r['piece']}/{r['item']} image {r['ordinal']}"
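Since an f-string can't be passed in from outside, one possible shape for this is a plain `str.format`-style template applied to each row. This is only a sketch: `format_row_id` is a hypothetical helper, not something csv-diff provides, and the row values are made up for illustration.

```python
def format_row_id(row, template):
    """Hypothetical helper: render a row's display ID from a user-supplied template."""
    # format_map looks each {field} up in the row dict by column name.
    return template.format_map(row)


# Illustrative row; the column names follow the example above, the values are invented.
row = {"lettercode": "WO", "series": "95", "piece": "1", "item": "2", "ordinal": "3"}
template = "{lettercode} {series}/{piece}/{item} image {ordinal}"
label = format_row_id(row, template)
```

A template that names a column the CSV doesn't have would raise a KeyError here, so a real option would probably want to validate the template against the header first.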
Seconding this; I'd find it very useful. As far as I can tell, no other similar library supports composite keys, and I already use and am very happy with this package.
Hi @simonw I've made progress on this feature and would like to share. Would you grant me write access?
@puddleoasis why not submit a PR?