[BUG] NVTabular workflow raises an error when serving the PyTorch T4Rec example
Describe the bug
I tested the end-to-end example from the Transformers4Rec repo (link to the notebook), but I am getting an error from the NVTabular workflow related to the Normalize op. The detailed error stack is below:
---------------------------------------------------------------------------
InferenceServerException Traceback (most recent call last)
Input In [18], in <cell line: 14>()
12 MODEL_NAME_NVT = "t4r_pytorch"
14 with grpcclient.InferenceServerClient("localhost:8001") as client:
---> 15 response = client.infer(MODEL_NAME_NVT, inputs)
16 print(col, ':\n', response.as_numpy(col))
File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1295, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
1293 return result
1294 except grpc.RpcError as rpc_error:
-> 1295 raise_error_grpc(rpc_error)
File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
61 def raise_error_grpc(rpc_error):
---> 62 raise get_error_grpc(rpc_error) from None
InferenceServerException: [StatusCode.INTERNAL] in ensemble 't4r_pytorch', Failed to process the request(s) for model instance 't4r_pytorch_nvt', message: AttributeError: 'Normalize' object has no attribute 'out_dtype'
At:
/usr/local/lib/python3.8/dist-packages/nvtabular/ops/normalize.py(112): output_dtype
/usr/local/lib/python3.8/dist-packages/nvtabular/ops/normalize.py(85): transform
/usr/local/lib/python3.8/dist-packages/nvtx/nvtx.py(101): inner
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(192): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
/usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(107): run_workflow
/workspace/TF4Rec/models/t4r_pytorch_nvt/1/model.py(120): execute
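For additional context: the failure happens inside the Normalize op's output_dtype property, which reads an out_dtype attribute that is missing on the deserialized op. Below is a minimal sketch of this failure mode; the class is a simplified stand-in rather than NVTabular's actual implementation, and only the attribute/property names are taken from the traceback.

```python
# Simplified stand-in for the failure in the traceback: an op serialized by a
# version that never set `out_dtype` is later read by code that expects it.
class Normalize:
    def __init__(self, out_dtype=None):
        # Newer code sets the attribute here; an op pickled by an older
        # version never ran this __init__, so the attribute is absent.
        self.out_dtype = out_dtype

    @property
    def output_dtype(self):
        # Analogous in spirit to normalize.py:112 in the traceback.
        return self.out_dtype


op = Normalize.__new__(Normalize)  # mimic unpickling: __init__ is skipped
try:
    op.output_dtype
except AttributeError as exc:
    print(exc)  # 'Normalize' object has no attribute 'out_dtype'
```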
Steps/Code to reproduce bug
- Pull the PyTorch inference container
- Pull the latest main of Transformers4Rec and NVTabular
- Run the notebook 01-ETL-with-NVTabular.ipynb to generate data
- Run the notebook 02-End-to-end-session-based-with-Yoochoose-PyT.ipynb (the failing inference cell is sketched after this list)
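For reference, the failing cell in the second notebook looks roughly like the sketch below, reconstructed from the traceback. The helper import and the parquet path are assumptions based on how the example notebooks typically build Triton inputs; adjust them to wherever the first notebook actually wrote its data.

```python
import pandas as pd
import tritonclient.grpc as grpcclient
from nvtabular.inference.triton import convert_df_to_triton_input  # helper used by the example notebooks

MODEL_NAME_NVT = "t4r_pytorch"

# A few raw interactions produced by 01-ETL-with-NVTabular.ipynb
# (placeholder path; point it at the file the ETL notebook wrote).
batch = pd.read_parquet("/workspace/data/interactions_merged_df.parquet").head(3)

# Wrap the dataframe columns as Triton InferInput objects.
inputs = convert_df_to_triton_input(batch.columns, batch, grpcclient.InferInput)

with grpcclient.InferenceServerClient("localhost:8001") as client:
    response = client.infer(MODEL_NAME_NVT, inputs)
    # With the bug present, this call raises the InferenceServerException shown
    # above instead of returning the transformed columns.
```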
Expected behavior
- Serve the NVTabular workflow without error
@sararb, please triage this bug.
I wasn't able to replicate this one; I'm not sure if it's been fixed or if it was due to environment issues of some kind.
@sararb, should we continue to track this bug? It looks like Karl couldn't replicate it.
I think this has been addressed by recent-ish changes in NVTabular.
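If anyone still hits this on an older image, one possible workaround is to load the saved workflow with an up-to-date NVTabular, backfill out_dtype on any op that lacks it, and re-save before exporting the ensemble. This is a rough, hypothetical sketch: the path, the traversal via parents_with_dependencies, and the assumption that None matches the constructor default are all unverified here.

```python
import nvtabular as nvt

# Placeholder path: wherever 01-ETL-with-NVTabular.ipynb saved the fitted workflow.
workflow = nvt.Workflow.load("/workspace/data/workflow_etl")

def backfill_out_dtype(node, seen=None):
    """Walk the workflow graph and add a default `out_dtype` where it is missing."""
    seen = seen if seen is not None else set()
    if id(node) in seen:
        return
    seen.add(id(node))
    op = getattr(node, "op", None)
    if op is not None and not hasattr(op, "out_dtype"):
        op.out_dtype = None  # assumption: None mirrors the constructor default
    for parent in getattr(node, "parents_with_dependencies", []):
        backfill_out_dtype(parent, seen)

backfill_out_dtype(workflow.output_node)

# Re-save so the Triton export picks up the patched ops.
workflow.save("/workspace/data/workflow_etl")
```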