ckanext-datastorer
datastore_upload paster command error on SA
Hey, the datastorer_upload script is crashing on SA with the traceback below.
I think this is the Excel file that it crashes while trying to parse:
http://data.sa.gov.au/dataset/liquor-gaming-licences/resource/51139b10-7835-41ea-b6ec-d29964d619cd
It'd be good if the script could be fixed to parse this file, but the main question is: does crashing on this file mean the script never gets to the other resource files that come after it, or does it continue?
Traceback (most recent call last):
  File "/usr/lib/ckan/sa/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/lib/ckan/sa/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/lib/ckan/sa/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/lib/ckan/sa/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/usr/lib/ckan/sa/src/ckanext-datastorer/ckanext/datastorer/commands.py", line 244, in command
    status = self.push_to_datastore(context, resource)
  File "/usr/lib/ckan/sa/src/ckanext-datastorer/ckanext/datastorer/commands.py", line 302, in push_to_datastore
    offset, headers = headers_guess(row_set.sample)
  File "/usr/lib/ckan/sa/local/lib/python2.7/site-packages/messytables/headers.py", line 29, in headers_guess
    rows = list(rows)
  File "/usr/lib/ckan/sa/local/lib/python2.7/site-packages/messytables/core.py", line 177, in __iter__
    for row in self.raw(sample=sample):
  File "/usr/lib/ckan/sa/local/lib/python2.7/site-packages/messytables/excel.py", line 77, in raw
    xlrd.xldate_as_tuple(value, self.sheet.book.datemode)
  File "/usr/lib/ckan/sa/local/lib/python2.7/site-packages/xlrd/xldate.py", line 60, in xldate_as_tuple
    raise XLDateNegative(xldate)
xlrd.xldate.XLDateNegative: -12943.0
@seanh Did this get resolved? The error that you are seeing appears to come from this line [1]. The exception isn't currently being caught, so it will prevent the rest of the datasets from being processed.
[1] https://github.com/okfn/ckanext-datastorer/blob/master/ckanext/datastorer/commands.py#L304
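One way to stop a single bad file from aborting the whole run might be to catch the exception per resource. The following is only a rough sketch, not the actual code in commands.py: `push_all` and its `push` argument are hypothetical names standing in for the command's loop and for whatever callable pushes one resource to the datastore.

```python
import logging

log = logging.getLogger(__name__)


def push_all(resources, push):
    """Call push(resource) for each resource; log failures and keep going.

    `push` is a hypothetical stand-in for the real per-resource upload call.
    Returns the ids of the resources that could not be pushed.
    """
    failed = []
    for resource in resources:
        try:
            push(resource)
        except Exception:
            # Log the full traceback, then move on to the next resource
            # instead of letting the exception end the whole run.
            log.exception("Could not push resource %s", resource.get("id"))
            failed.append(resource.get("id"))
    return failed
```

Catching `Exception` this broadly is a trade-off: it keeps the cron job moving past files like the Excel one above, but the returned list (or the log) would still need checking so that failures are not silently ignored.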
As far as I know this never got resolved, and the datastorer script is still crashing on SA (and on some other sites too, I think). I just logged in and ran the datastorer cron job on SA, and it still seems to crash, even after I pulled the latest commits from the ckanext-datastorer master branch.