data.world-py
`load_dataset` fails to run due to nanosecond timestamps in API response
Hi,
I'm getting an error when using the `load_dataset` function. It seems that the API is providing datetime information with nanosecond resolution, while `datetime` only supports up to microsecond resolution:
Traceback (most recent call last):
  File "~\t2.py", line 3, in <module>
    dw_ds = dw.load_dataset("{owner}/{id}") # modified for privacy
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\venv\Lib\site-packages\datadotworld\__init__.py", line 99, in load_dataset
    load_dataset(dataset_key,
  File "~\venv\Lib\site-packages\datadotworld\datadotworld.py", line 164, in load_dataset
    last_modified = datetime.strptime(dataset_info['updated'],
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\Miniconda3\envs\nox\Lib\_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\Miniconda3\envs\nox\Lib\_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '2023-03-22T17:14:38.878483744Z' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'
This should work if the fractional seconds are truncated to microseconds (six digits) before parsing with `datetime`:
datetime.datetime.strptime("2023-03-22T17:14:38.878483Z", "%Y-%m-%dT%H:%M:%S.%fZ") # works
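For illustration, here is a minimal sketch of that truncation using only the standard library. The helper name `parse_api_timestamp` is hypothetical and not part of data.world-py; it assumes the API always returns UTC timestamps ending in `Z`, as in the traceback above, and it returns a naive `datetime` to match what the existing `strptime` call would produce.

```python
import re
from datetime import datetime

# Hypothetical helper (not part of data.world-py): matches the fractional
# seconds part of a timestamp such as "2023-03-22T17:14:38.878483744Z".
_FRACTION = re.compile(r"\.(\d+)")


def parse_api_timestamp(value: str) -> datetime:
    """Parse a 'Z'-suffixed timestamp, dropping digits beyond microseconds."""
    # %f accepts at most six fractional digits, so keep only the first six.
    trimmed = _FRACTION.sub(lambda m: "." + m.group(1)[:6], value, count=1)
    # Assumption: a timestamp may also arrive without any fractional part.
    if "." in trimmed:
        fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    else:
        fmt = "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(trimmed, fmt)


last_modified = parse_api_timestamp("2023-03-22T17:14:38.878483744Z")
```

Truncating rather than rounding keeps the sketch simple and avoids any carry into the seconds field; the lost precision is below one microsecond.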
Let me know if I can provide any more info.
Happy to contribute a PR if you would like.
Thanks!
I've also just hit this. As a workaround, you can pass `force_update=True` to bypass the `last_modified` check.
Oh, good tip! Will try that instead. Thanks @alexcrawley!
Still, I think this should be addressed within the package if the purpose is to store cached requests.