Fix some errors when executing dumps
Hello!
I'm trying to dump an instance, but the package is throwing some errors. This PR fixes the ones I run into along the way.
Problems when logging errors
TODO: See #209
KeyError: 'format'
Traceback (most recent call last):
  File "/home/pdelboca/Repos/ckanapi/.venv/bin/ckanapi", line 33, in <module>
    sys.exit(load_entry_point('ckanapi', 'console_scripts', 'ckanapi')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdelboca/Repos/ckanapi/ckanapi/cli/main.py", line 156, in main
    return dump_things(ckan, thing[0], arguments)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdelboca/Repos/ckanapi/ckanapi/cli/dump.py", line 110, in dump_things
    create_datapackage(record, datapackages_path, stderr, apikey)
  File "/home/pdelboca/Repos/ckanapi/ckanapi/datapackage.py", line 67, in create_datapackage
    filename = resource_filename(dres)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pdelboca/Repos/ckanapi/ckanapi/datapackage.py", line 87, in resource_filename
    ext = slugify.slugify(dres['format'])
                          ~~~~^^^^^^^^^^
KeyError: 'format'
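The traceback shows that `dres['format']` raises when a resource has no `format` key. One possible way to tolerate that is sketched below. It is not necessarily the change made in this PR: `resource_extension` is a hypothetical helper, and the small `slugify` function is only a stand-in for `slugify.slugify` so that the example runs without extra dependencies.

```python
import re


def slugify(text):
    # Stand-in for slugify.slugify (assumption): lowercase the text and
    # collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')


def resource_extension(dres, default=''):
    """Return a slugified extension for a resource dict.

    Hypothetical helper: falls back to `default` when the 'format' key
    is missing, None, or empty, instead of raising KeyError.
    """
    fmt = dres.get('format') or default
    return slugify(fmt)
```

With this approach a resource without a format ends up with an empty extension, so the caller still has to decide how to name that file.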
@wardi have you ever used ckanapi to dump a whole portal? I'm trying to dump https://datos.gob.ar/, but it is extremely slow and it also gets "blocked" after 250 datasets (blocked = it doesn't write any output and shows no progress; nothing happens).
I'm trying to do:
ckanapi dump datasets --all --datapackages=./output_directory/ -r https://datos.gob.ar
@pdelboca we use it daily to create a history of our metadata for ~30k datasets. It's possible you're being throttled on the server side. dump datasets makes a separate package_show query for every dataset; you could try search datasets instead, which paginates over package_search and so makes far fewer requests.
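To illustrate the difference in request count, here is a rough sketch of paging over package_search using its `start` and `rows` parameters. `iter_datasets` is a hypothetical helper, not part of ckanapi; it accepts any callable that takes `start` and `rows` and returns a dict with `count` and `results`, which is assumed here to be the shape of a package_search response.

```python
def iter_datasets(package_search, rows=100):
    """Yield every dataset dict by paging through package_search.

    `package_search` is any callable accepting `start` and `rows` keyword
    arguments, for example RemoteCKAN(url).action.package_search.
    """
    start = 0
    while True:
        page = package_search(start=start, rows=rows)
        results = page['results']
        if not results:
            break
        yield from results
        # Advance by what was actually returned, in case the server
        # caps `rows` below the requested value.
        start += len(results)
        if start >= page['count']:
            break


# Possible usage against a live portal (not executed here):
#   from ckanapi import RemoteCKAN
#   ckan = RemoteCKAN('https://datos.gob.ar')
#   for dataset in iter_datasets(ckan.action.package_search):
#       print(dataset['name'])
```

With `rows=100`, a portal of N datasets needs about N/100 requests rather than one package_show call per dataset.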
At the moment it's possible to resume an interrupted load, but not a dump. Maybe that's needed if you are being throttled.