data.world-py

`load_dataset` fails to run due to nanosecond timestamps in API response

Open nachomaiz opened this issue 1 year ago • 2 comments

Hi,

I'm getting an error when using the `load_dataset` function. It seems that the API returns the dataset's `updated` timestamp with nanosecond resolution, while Python's datetime only supports up to microsecond resolution:

Traceback (most recent call last):
  File "~\t2.py", line 3, in <module>
    dw_ds = dw.load_dataset("{owner}/{id}")  # modified for privacy
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\venv\Lib\site-packages\datadotworld\__init__.py", line 99, in load_dataset
    load_dataset(dataset_key,
  File "~\venv\Lib\site-packages\datadotworld\datadotworld.py", line 164, in load_dataset
    last_modified = datetime.strptime(dataset_info['updated'],
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\Miniconda3\envs\nox\Lib\_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\Miniconda3\envs\nox\Lib\_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '2023-03-22T17:14:38.878483744Z' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'

This should work if nanoseconds are stripped from the string before parsing with datetime.

datetime.datetime.strptime("2023-03-22T17:14:38.878483Z", "%Y-%m-%dT%H:%M:%S.%fZ")  # works
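
To illustrate the idea, here is a minimal sketch of truncating the fractional seconds to six digits before parsing. The helper name `parse_api_timestamp` is hypothetical and not part of datadotworld; the format string is the one shown in the traceback.

```python
import re
from datetime import datetime

# Format string taken from the traceback above.
API_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_api_timestamp(value: str) -> datetime:
    """Hypothetical helper: drop sub-microsecond digits, then parse."""
    # Keep the first six fractional digits and discard any that follow,
    # since %f accepts at most six digits.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", value)
    return datetime.strptime(trimmed, API_FORMAT)


print(parse_api_timestamp("2023-03-22T17:14:38.878483744Z"))
# → 2023-03-22 17:14:38.878483
```

This truncates rather than rounds, and it returns a naive datetime, as the existing format string does. A timestamp with no fractional part at all would still not match this format, so that case would need separate handling.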

Let me know if I can provide any more info.

Happy to contribute a PR if you would like.

Thanks!

nachomaiz, Apr 05 '23 18:04

I've also just hit this. As a workaround, you can pass force_update=True to bypass the last_modified check.

alexcrawley, Oct 25 '23 12:10

> I've also just hit this. As a workaround, you can pass force_update=True to bypass the last_modified check.

Oh, good tip! Will try that instead. Thanks @alexcrawley!

Still, I think this should be addressed within the package if the purpose is to store cached requests.

nachomaiz, Oct 27 '23 11:10