NVTabular

[BUG] Dtype discrepancy with pandas and groupby on CPU

Open • oliverholworthy opened this issue 2 years ago • 1 comment

Describe the bug

Steps/Code to reproduce bug

  • Run the notebook https://github.com/NVIDIA-Merlin/Transformers4Rec/blob/v23.02.00/examples/getting-started-session-based/01-ETL-with-NVTabular.ipynb
    • In a CPU-only environment

The following exception is raised:

TypeError: Dtype discrepancy detected for column age_days-list: operator Groupby reported dtype `DType(name='float32', element_type=<ElementType.Float: 'float'>, element_size=32, element_unit=None, signed=True, shape=Shape(dims=None))` but returned dtype `DType(name='float64', element_type=<ElementType.Float: 'float'>, element_size=64, element_unit=None, signed=True, shape=Shape(dims=None))`.
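
For anyone trying to narrow this down outside the notebook, here is a minimal pandas-only sketch of one way a float32 column could come back as float64 after a list aggregation. It is not the notebook's code and does not use NVTabular. The toy data and the `session_id` column are hypothetical, and only the `age_days` name is taken from the error message. Whether this is the actual root cause inside the Groupby operator is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column name "age_days" comes from the error message.
df = pd.DataFrame(
    {
        "session_id": [1, 1, 2],
        "age_days": np.array([0.5, 1.5, 2.5], dtype="float32"),
    }
)

# Collect the values of each group into a Python list (CPU / pandas path).
grouped = df.groupby("session_id")["age_days"].agg(list)

# Dtype that NumPy infers when each list is turned back into an array.
inferred_dtypes = [np.asarray(values).dtype for values in grouped]
print("input dtype:", df["age_days"].dtype)
print("inferred list dtypes:", inferred_dtypes)

# Possible workaround (an assumption, not a confirmed fix): cast each list
# back to the dtype of the input column.
restored = [np.asarray(values, dtype=df["age_days"].dtype) for values in grouped]
print("restored dtypes:", [arr.dtype for arr in restored])
```

If the inferred dtypes print as float64 while the input dtype is float32, that mirrors the reported/returned mismatch in the exception above.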

Expected behavior

No exception is raised, and the output matches the equivalent result when running on GPU with cudf.

Environment details:

  • Environment location: Docker
  • Method of NVTabular install: from source

Additional context

A similar issue was reported recently in #1767. However, that particular example now works following a change in core: https://github.com/NVIDIA-Merlin/core/pull/226

oliverholworthy • Mar 13 '23 09:03

@oliverholworthy I ran on the 23.04 pytorch container without GPU and it ran without error. Is this error only apparent when installing NVTabular from source? Or was it corrected with changes in core also?

angmc • May 19 '23 14:05