NVTabular
[BUG] WDL training notebook for HugeCTR processing workflow fails with TypeError
Describe the bug
I was running this notebook, which uses NVTabular to process the clicks dataset (link) commonly used as a demo for HugeCTR. When running it, I keep hitting a TypeError. Given my limited knowledge of NVTabular's codebase, this has been difficult to debug.
Traceback:
2022-09-14 22:01:31,504 NVTabular processing
2022-09-14 22:01:32,957 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
Traceback (most recent call last):
  File "./preprocess.py", line 418, in <module>
    process_NVT(args)
  File "./preprocess.py", line 176, in process_NVT (this line)
    features += col0 >> FeatureCross(col1) >> Rename(postfix="_"+col1) >> cross_cat_op
  File "/usr/local/lib/python3.8/dist-packages/merlin/dag/base_operator.py", line 233, in __rrshift__
    return ColumnSelector(other) >> self
  File "/usr/local/lib/python3.8/dist-packages/merlin/dag/selector.py", line 128, in __rshift__
    return operator.create_node(self) >> operator
  File "/usr/local/lib/python3.8/dist-packages/nvtabular/workflow/node.py", line 30, in __rshift__
    return super().__rshift__(operator)
  File "/usr/local/lib/python3.8/dist-packages/merlin/dag/node.py", line 262, in __rshift__
    child.add_dependency(dependency)
  File "/usr/local/lib/python3.8/dist-packages/merlin/dag/node.py", line 80, in add_dependency
    dep_node = Node.construct_from(dep)
  File "/usr/local/lib/python3.8/dist-packages/merlin/dag/node.py", line 497, in construct_from
    raise TypeError(
TypeError: Unsupported type: Cannot convert object of type <class 'method'> to Node.
I've narrowed this down to the FeatureCross class, which is implemented as a child class of Operator.
FeatureCross implementation (from notebook):
class FeatureCross(Operator):
    def __init__(self, dependency):
        self.dependency = dependency

    def transform(self, columns, gdf):
        new_df = type(gdf)()
        for col in columns.names:
            new_df[col] = gdf[col] + gdf[self.dependency]
        return new_df

    def dependencies(self):
        return [self.dependency]
It fails in the Node.construct_from method, which seems to expect a list, a str, or a ColumnSelector. That makes intuitive sense, but I don't see how the FeatureCross implementation could ever produce anything but a TypeError, since it is none of those types. It has a ColumnSelector as a property but is not one itself (I believe).
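One reading of the error message, not verified against the merlin-core source in this container: the `<class 'method'>` in the TypeError suggests that the object reaching Node.construct_from is the bound `dependencies` method itself, not the list it returns. That would happen if the library reads `operator.dependencies` as an attribute without calling it. The sketch below reproduces that behavior with toy stand-ins. `construct_from`, `collect_dependencies`, `MethodCross`, and `PropertyCross` are hypothetical names written for illustration and are not the merlin or NVTabular implementations.

```python
from collections.abc import Sequence


def construct_from(dep):
    """Toy stand-in for Node.construct_from: accepts only str or list."""
    if isinstance(dep, (str, list)):
        return dep
    raise TypeError(
        f"Unsupported type: Cannot convert object of type {type(dep)} to Node."
    )


def collect_dependencies(operator):
    """Toy stand-in for the dependency handling in Node.__rshift__.

    Assumption: `dependencies` is read as an attribute and never called.
    """
    dependencies = operator.dependencies
    if not dependencies:
        return []
    if not isinstance(dependencies, Sequence):
        # A bound method is not a Sequence, so it gets wrapped as-is.
        dependencies = [dependencies]
    return [construct_from(dep) for dep in dependencies]


class MethodCross:
    """Same shape as the notebook's FeatureCross: dependencies is a method."""

    def __init__(self, dependency):
        self.dependency = dependency

    def dependencies(self):
        return [self.dependency]


class PropertyCross:
    """Hypothetical variant: dependencies is exposed as a property."""

    def __init__(self, dependency):
        self.dependency = dependency

    @property
    def dependencies(self):
        return [self.dependency]


if __name__ == "__main__":
    print(collect_dependencies(PropertyCross("C2")))
    try:
        collect_dependencies(MethodCross("C2"))
    except TypeError as exc:
        print(exc)
```

If this reading is right, decorating `dependencies` with `@property` in the notebook's FeatureCross may be enough to get past this line, but that is untested against the 22.07 container.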
Steps/Code to reproduce bug
Run the linked notebook above within the environment specified below. (I changed very little besides paths to data.)
The notebook in the triton HugeCTR backend repo and the HugeCTR repo both fail with the same error here.
Expected behavior
For this notebook to run without error given the environment provided below.
Environment details (please complete the following information):
- Environment location: [Bare-metal, Docker, Cloud(specify cloud provider)]: Docker container
- Container: nvcr.io/nvidia/merlin/merlin-hugectr
- Container Version: :22.07
- Docker Version: 20.10
- Method of NVTabular install: [conda, Docker, or from source]: Docker
- If method of install is [Docker], provide docker pull & docker run commands used:
Docker steps
# start container (this pulls it too)
sudo docker run -it --name merlin-hugectr-2 --gpus=all --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v ${PWD}:/models -v ${PWD}:/data/ -w /data/ -p 8888:8888 -p 8000:8000 -p 8001:8001 -p 8002:8002 nvcr.io/nvidia/merlin/merlin-hugectr:22.07
# start jupyterlab
jupyter lab --no-browser --allow-root --ip 0.0.0.0 --port 8888 --NotebookApp.token='hugectr'
Additional context
Pinging @EvenOldridge, who requested that this be filed here.