NVTabular
[BUG] Unable to access cuDF due to RuntimeError: cuDF failure : Unsupported type_id conversion to cudf
Describe the bug: I am trying to run the example code at https://nvidia-merlin.github.io/NVTabular/main/api/ops/categorify.html and it fails with a RuntimeError when the cuDF DataFrame is created.
import cudf
import nvtabular as nvt
# Create toy dataset
df = cudf.DataFrame({
    'author': ['User_A', 'User_B', 'User_C', 'User_C', 'User_A', 'User_B', 'User_A'],
    'productID': [100, 101, 102, 101, 102, 103, 103],
    'label': [0, 0, 1, 1, 1, 0, 0]
})  # ERROR: RuntimeError: cuDF failure at: /opt/rapids/src/cudf/cpp/src/interop/from_arrow.cu:86: Unsupported type_id conversion to cudf
dataset = nvt.Dataset(df)
# Define pipeline
CATEGORICAL_COLUMNS = ['author', 'productID']
cat_features = CATEGORICAL_COLUMNS >> nvt.ops.Categorify(
    freq_threshold={"author": 3, "productID": 2},
    num_buckets={"author": 10, "productID": 20})
# Initialize the workflow and execute it
proc = nvt.Workflow(cat_features)
proc.fit(dataset)
ddf = proc.transform(dataset).to_ddf()
# Print results
print(ddf.compute())
Also, when following https://github.com/NVIDIA-Merlin/NVTabular/blob/main/tests/unit/examples/test_02-Advanced-NVTabular-workflow.py, I get an error on this import:
from merlin.core.compat import cudf
ImportError Traceback (most recent call last)
Cell In[12], line 1
----> 1 from merlin.core.compat import cudf
ImportError: cannot import name 'cudf' from 'merlin.core.compat' (/usr/local/lib/python3.8/dist-packages/merlin/core/compat.py)
Expected behavior: Both snippets should run without errors.
Environment details:
- Platform: Debian 4.19.269-1
- Python version: 3.8.10
- PyTorch version (GPU?): 2.0.0 (with GPU support)
- Environment location: GCP
- Method of NVTabular install: Docker
- Docker image used: nvcr.io/nvidia/merlin/merlin-pytorch:23.02. All cudf libs were installed by GCP by default.
Additional context
cudf 22.8.0a0+304.g6ca81bbc78.dirty
dask-cudf 22.8.0a0+304.g6ca81bbc78.dirty
rmm 22.8.0a0+62.gf6bf047.dirty
torch 2.0.0
triton 2.0.0
tritonclient 2.32.0
merlin 1.9.1
merlin-core 0.5.0
merlin-dataloader 0.0.3
merlin-models 23.2.0
merlin-systems 23.2.0
nvtabular 23.2.0
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
nvidia-pyindex 1.0.9
GPU: Tesla T4
CUDA Version: 11.8
NVIDIA-SMI 510.47.03, Driver Version: 510.47.03
Cuda compilation tools, release 11.8, V11.8.89, Build cuda_11.8.r11.8/compiler.31833905_0
OS in container: Ubuntu 20.04.5 LTS
It looks like you have an older version of merlin-core. The latest is 23.02.01. Based on when merlin.core.compat was added, I'm fairly confident installing a newer version of merlin-core will resolve the cudf import issue you described.
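To confirm which versions are actually installed inside the container, a minimal sketch like the following could be used. It relies only on the standard library's importlib.metadata (available since Python 3.8). The helper name installed_versions and the package list are illustrative assumptions, not part of NVTabular or Merlin.

```python
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Illustrative list of distributions mentioned in this thread (an assumption).
PACKAGES = ["merlin-core", "nvtabular", "cudf", "dask-cudf", "rmm", "pandas"]

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in the same Python environment as the notebook shows whether merlin-core is still the old release there, independent of what the container image was expected to ship.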
@karlhigley May I use
FROM nvcr.io/nvidia/merlin/merlin-pytorch:latest
in the Dockerfile so that I always get the latest version?
@karlhigley, I got a build error with
FROM nvcr.io/nvidia/merlin/merlin-pytorch:23.02.01
(same error for :latest):
"Containerize the artifact": manifest for nvcr.io/nvidia/merlin/merlin-pytorch:23.02.01 not found: manifest unknown: manifest unknown
Ah sorry, I meant the latest version of merlin-core is 23.02.01; there's no 23.02.01 container version. The latest version of the Torch container comes with merlin-core 23.2.0 pre-installed, which should be new enough to avoid the merlin.core.compat error you mentioned. Since you have merlin-core 0.5.0, I'm guessing you may have installed one of the Merlin libraries from source, some of which have overly permissive version specifiers and can cause this issue. Using the merlin-pytorch 23.02 container, it should be sufficient to pip install merlin-core after installing any of the other Merlin libraries from source.
@karlhigley, I am now using this container image:
FROM nvcr.io/nvidia/merlin/merlin-pytorch:nightly
I got:
merlin 1.10.0
merlin-core 23.2.1
merlin-dataloader 23.2.1
merlin-models 23.2.0
merlin-systems 0+untagged.1.ge94d2a9
cuda-python 11.8.1
cudf 22.8.0a0+304.g6ca81bbc78.dirty
cupy-cuda117 10.6.0
When I run
import cudf
# import pandas as pd
print('cuDF Version:', cudf.__version__)
I got:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[2], line 1
----> 1 import cudf
2 # import pandas as pd
3 print('cuDF Version:', cudf.__version__)
File /usr/local/lib/python3.8/dist-packages/cudf/__init__.py:12
8 from numba import config as numba_config, cuda
10 import rmm
---> 12 from cudf.api.types import dtype
13 from cudf import api, core, datasets, testing
14 from cudf._version import get_versions
File /usr/local/lib/python3.8/dist-packages/cudf/api/__init__.py:3
1 # Copyright (c) 2021, NVIDIA CORPORATION.
----> 3 from cudf.api import extensions, types
5 __all__ = ["extensions", "types"]
File /usr/local/lib/python3.8/dist-packages/cudf/api/types.py:18
15 from pandas.api import types as pd_types
17 import cudf
---> 18 from cudf.core.dtypes import ( # noqa: F401
19 _BaseDtype,
20 dtype,
21 is_categorical_dtype,
22 is_decimal32_dtype,
23 is_decimal64_dtype,
24 is_decimal128_dtype,
25 is_decimal_dtype,
26 is_interval_dtype,
27 is_list_dtype,
28 is_struct_dtype,
29 )
32 def is_numeric_dtype(obj):
33 """Check whether the provided array or dtype is of a numeric dtype.
34
35 Parameters
(...)
43 Whether or not the array or dtype is of a numeric dtype.
44 """
File /usr/local/lib/python3.8/dist-packages/cudf/core/dtypes.py:13
11 from pandas.api import types as pd_types
12 from pandas.api.extensions import ExtensionDtype
---> 13 from pandas.core.arrays._arrow_utils import ArrowIntervalType
14 from pandas.core.dtypes.dtypes import (
15 CategoricalDtype as pd_CategoricalDtype,
16 CategoricalDtypeType as pd_CategoricalDtypeType,
17 )
19 import cudf
ModuleNotFoundError: No module named 'pandas.core.arrays._arrow_utils'
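The traceback above ends with cuDF importing a private pandas module that the installed pandas does not provide. As a minimal sketch, this can be checked without importing cudf itself by using the standard library's importlib.util.find_spec. The helper name module_available is hypothetical.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if `dotted_name` can be located by the import system.

    Note: for a dotted name, find_spec imports the parent packages.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised (as ModuleNotFoundError) when a parent package is missing.
        return False


# Module path taken from the last frame of the traceback above.
print(module_available("pandas.core.arrays._arrow_utils"))
```

If this prints False, that would suggest the installed pandas does not match the version this cuDF build was written against.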
You can build an image that way, but we don't generally guarantee the stability of the nightly images. Are you seeing the same issue when building from this image?
FROM nvcr.io/nvidia/merlin/merlin-pytorch:23.02
@karlhigley, yes, I get the same error with
FROM nvcr.io/nvidia/merlin/merlin-pytorch:23.02
@jperez999 Are there known version incompatibility issues between Pandas and cuDF that might explain this?