git_diff_xlsx
Converts a Microsoft Excel 2007+ file into plain text for comparison using git diff
This script parses an .xlsx file and converts it into a text format which can be compared using git.
I wrote this script because I wanted to put an existing computational model under version control, and the model's input files are defined in Microsoft Excel workbooks.
See my blog entry for more details of how it works.
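The general idea of turning a workbook into diffable text can be sketched as follows. This is not the parse_xlsx script shipped with this package, only a minimal illustration that uses nothing beyond the Python standard library. The function names (xlsx_to_text, _shared_strings) and the "cell: value" output layout are assumptions made for the example. It relies on the fact that an .xlsx file is a zip archive of XML parts, and it ignores sheet display names, inline strings, dates and number formats.

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by SpreadsheetML parts inside an .xlsx archive.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}


def _shared_strings(archive):
    """Return the workbook's shared string table as a list (hypothetical helper)."""
    try:
        root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    except KeyError:  # the part is absent when the workbook has no shared strings
        return []
    return [
        "".join(t.text or "" for t in si.iter("{%s}t" % MAIN_NS))
        for si in root.findall("m:si", NS)
    ]


def xlsx_to_text(path):
    """Render every worksheet part as one 'CELL: value' line per non-empty cell."""
    lines = []
    with zipfile.ZipFile(path) as archive:
        strings = _shared_strings(archive)
        # Worksheet parts are listed in lexicographic order of their part names.
        sheets = sorted(
            name for name in archive.namelist()
            if re.fullmatch(r"xl/worksheets/[^/]+\.xml", name)
        )
        for name in sheets:
            lines.append("=== %s ===" % name)
            root = ET.fromstring(archive.read(name))
            for cell in root.iter("{%s}c" % MAIN_NS):
                formula = cell.find("m:f", NS)
                value = cell.find("m:v", NS)
                if formula is not None and formula.text:
                    text = "=" + formula.text  # show the formula, not its cached result
                elif value is not None and value.text is not None:
                    text = value.text
                    if cell.get("t") == "s":  # value is an index into the shared strings
                        text = strings[int(text)]
                else:
                    continue  # empty or unsupported cell
                lines.append("%s: %s" % (cell.get("r"), text))
    return "\n".join(lines) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    # A textconv command receives the file path as its argument and writes text to stdout.
    sys.stdout.write(xlsx_to_text(sys.argv[1]))
```

Because each cell ends up on its own line, a change to a single value or formula shows up as a one-line difference.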
Installation
- Download the latest release
- Run python setup.py install
- Add the following lines to the global .gitconfig file:
[diff "git_diff_xlsx"]
binary = True
textconv = parse_xlsx
cachetextconv = true
- Add the following line to your repository's .gitattributes file:
*.xlsx diff=git_diff_xlsx
- Now, typing git diff at the prompt will produce differences between text versions of Excel .xlsx files
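If you set up several repositories, the .gitattributes step can be scripted. The helper below is a small sketch and not part of this package; the name ensure_gitattributes is hypothetical. It appends the attribute line only when it is not already present, so running it twice is harmless.

```python
from pathlib import Path

# The attribute line from the installation steps above.
ATTR_LINE = "*.xlsx diff=git_diff_xlsx"


def ensure_gitattributes(repo_dir):
    """Add ATTR_LINE to <repo_dir>/.gitattributes if missing.

    Returns True when the file was changed, False when the line already existed.
    """
    path = Path(repo_dir) / ".gitattributes"
    existing = path.read_text().splitlines() if path.exists() else []
    if any(line.strip() == ATTR_LINE for line in existing):
        return False
    existing.append(ATTR_LINE)
    path.write_text("\n".join(existing) + "\n")
    return True
```

For example, ensure_gitattributes(".") run from the top of a working tree updates that repository's .gitattributes file.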
Caveats
There are a bunch of issues with this script. I wrote it to fulfil a need I had then and there, and it contains lots of hard-coded horrors. Please feel free to help clean up the code by submitting issues and pull requests.