BUG: pandas.to_datetime fails to handle numpy.nan on riscv64 due to dependency on undefined behaviour

Open andreas-schwab opened this issue 2 years ago • 27 comments

Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas
import numpy
print(pandas.to_datetime(numpy.nan, unit="s"))

Issue Description

Converting a value of floating type to an integer type is undefined when the value is out of range for that integer type; see section 6.3.1.4, "Real floating and integer", of the C standard.
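To make the rule above concrete, here is a minimal sketch of the kind of check that has to happen before such a conversion. It is written in plain Python for illustration only and is not pandas' or NumPy's actual code; the helper name `checked_float_to_int64` and the choice of returning `None` for unrepresentable values are assumptions made for this example.

```python
import math

# Bounds of a signed 64-bit integer.
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def checked_float_to_int64(value):
    """Hypothetical helper: return int(value) if it fits in int64, else None."""
    # NaN and +/-inf have no integer value at all.
    if math.isnan(value) or math.isinf(value):
        return None
    # Python compares float and int exactly, so this range test is reliable.
    if not (INT64_MIN <= value <= INT64_MAX):
        return None
    return int(value)  # truncates toward zero, like a C cast


print(checked_float_to_int64(1.9))           # 1
print(checked_float_to_int64(float("nan")))  # None
print(checked_float_to_int64(1e30))          # None
```

In C or Cython the same test has to come before the cast itself, because once an out-of-range or NaN value is cast, the result may differ between architectures.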

You can use gcc92.fsffrance.org for your tests if you don't have your own hardware, or use qemu with the images from https://download.opensuse.org/ports/riscv/tumbleweed/images/.

$ python3 -c 'import pandas
import numpy
print(pandas.to_datetime(numpy.nan, unit="s"))'
Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/usr/lib64/python3.10/site-packages/pandas/core/tools/datetimes.py", line 1078, in to_datetime
    result = convert_listlike(np.array([arg]), format)[0]
  File "/usr/lib64/python3.10/site-packages/pandas/core/tools/datetimes.py", line 357, in _convert_listlike_datetimes
    return _to_datetime_with_unit(arg, unit, name, tz, errors)
  File "/usr/lib64/python3.10/site-packages/pandas/core/tools/datetimes.py", line 530, in _to_datetime_with_unit
    arr, tz_parsed = tslib.array_with_unit_to_datetime(arg, unit, errors=errors)
  File "pandas/_libs/tslib.pyx", line 266, in pandas._libs.tslib.array_with_unit_to_datetime
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: cannot convert input with unit 's'

Expected Behavior

No error.

Installed Versions

/usr/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS

commit           : e8093ba372f9adfe79439d90fe74b0b5b6dea9d6
python           : 3.10.6.final.0
python-bits      : 64
OS               : Linux
OS-release       : 6.0.0-rc5-38-default
Version          : #1 SMP Mon Sep 12 15:18:20 UTC 2022 (005845a)
machine          : riscv64
processor        : riscv64
byteorder        : little
LC_ALL           : None
LANG             : de_DE.UTF-8
LOCALE           : de_DE.UTF-8

pandas           : 1.4.3
numpy            : 1.21.6
pytz             : 2022.1
dateutil         : 2.8.2
setuptools       : 63.2.0
pip              : 22.0.4
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : None
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : 1.3.5
brotli           : 1.0.9
fastparquet      : None
fsspec           : None
gcsfs            : None
markupsafe       : None
matplotlib       : None
numba            : None
numexpr          : 2.8.3
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pyreadstat       : None
pyxlsb           : None
s3fs             : None
scipy            : None
snappy           : None
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
zstandard        : None

andreas-schwab avatar Sep 17 '22 18:09 andreas-schwab

Hi, thanks for your report. This works on 1.4.3 and main for me. Can you please recheck that you have pandas 1.4.3 installed?

phofl avatar Sep 17 '22 19:09 phofl

Of course I have, why do you ask?

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

Because it works for me on 1.4.3, 1.4.4 and main

phofl avatar Sep 17 '22 20:09 phofl

How did you test that?

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

I executed the code snippet from your post?

phofl avatar Sep 17 '22 20:09 phofl

Where did you execute it? In qemu or on real hardware?

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

macOS

phofl avatar Sep 17 '22 20:09 phofl

I executed it on macOS.

phofl avatar Sep 17 '22 20:09 phofl

What hardware?

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

There is macOS for RISC-V???

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

Did you actually read the report?

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

Please use gcc92.fsffrance.org for your tests if you don't have your own hardware.

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

There is a code snippet, nothing else. I read the versions, but if you think that this is specific to your hardware, a small explanation would have been nice. Your report reads like a general issue, which is not the case.

Could you adjust your issue title and add a small explanation?

You could also try debugging it yourself if you are interested.

phofl avatar Sep 17 '22 20:09 phofl

You can also use one of the images in https://download.opensuse.org/ports/riscv/tumbleweed/images/ with qemu.

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

This is a general issue because it depends on undefined behaviour (converting NaN value to integer).

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

It works on my machine, so it seems to be hardware-dependent.

phofl avatar Sep 17 '22 20:09 phofl

Depending on undefined behaviour is a bug.

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

$ python3 -c $'import numpy\nprint(numpy.asarray(numpy.nan).astype("i8"))'
9223372036854775807

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

@andreas-schwab we don't support this hardware in any way.

You can submit a patch if you can find the problem.

jreback avatar Sep 17 '22 20:09 jreback

This has nothing to do with hardware support. This is undefined behaviour. Depending on undefined behaviour is a serious bug.

andreas-schwab avatar Sep 17 '22 20:09 andreas-schwab

You are welcome to submit a PR if you can identify the bug and provide a fix.

phofl avatar Sep 17 '22 20:09 phofl

@andreas-schwab could you please clarify? What commits is this diff between, what's it meant to show? I've formatted your code to make it easier to read, but - apologies for not understanding - I still don't see your point. Could you clarify what exactly you're expecting pandas to do?

"Did you actually read the report?"

Please be respectful.

MarcoGorelli avatar Sep 18 '22 09:09 MarcoGorelli

Your reaction to the bug report has been far from respectful so far.

andreas-schwab avatar Sep 18 '22 09:09 andreas-schwab

Could you please update the top post with steps on how to reproduce the bug if you are on Windows/Ubuntu/macOS? This will help someone who wants to work on this. We have tests covering this case in the CI, so simply executing the code snippet won't be sufficient.

Additionally, it would be great if you could add an explanation of what you are referring to with undefined behaviour; this is not clear to me. Some context for the code snippet you posted earlier would also be helpful.

phofl avatar Sep 18 '22 11:09 phofl

Converting a value of floating type to an integer type is undefined when the value is out of range for that integer type; see section 6.3.1.4, "Real floating and integer", of the C standard.

andreas-schwab avatar Sep 18 '22 11:09 andreas-schwab

You can use gcc92.fsffrance.org for your tests if you don't have your own hardware, or use qemu with the images from https://download.opensuse.org/ports/riscv/tumbleweed/images/.

andreas-schwab avatar Sep 18 '22 11:09 andreas-schwab

I copied it into the top post in case someone wants to work on this.

phofl avatar Sep 18 '22 12:09 phofl