[BUG] NVTabular workflow raised an error when serving the Pytorch T4Rec example

Open sararb opened this issue 3 years ago • 2 comments

Describe the bug

I tested the end-to-end example from the Transformers4Rec repo (link to the notebook), but I am getting an error from the NVTabular workflow related to the Normalize op. Below is the detailed error stack, followed by a minimal sketch of the failing request:

---------------------------------------------------------------------------
InferenceServerException                  Traceback (most recent call last)
Input In [18], in <cell line: 14>()
     12 MODEL_NAME_NVT = "t4r_pytorch"
     14 with grpcclient.InferenceServerClient("localhost:8001") as client:
---> 15     response = client.infer(MODEL_NAME_NVT, inputs)
     16     print(col, ':\n', response.as_numpy(col))

File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1295, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
   1293     return result
   1294 except grpc.RpcError as rpc_error:
-> 1295     raise_error_grpc(rpc_error)

File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
     61 def raise_error_grpc(rpc_error):
---> 62     raise get_error_grpc(rpc_error) from None

InferenceServerException: [StatusCode.INTERNAL] in ensemble 't4r_pytorch', Failed to process the request(s) for model instance 't4r_pytorch_nvt', message: AttributeError: 'Normalize' object has no attribute 'out_dtype'

At:
  /usr/local/lib/python3.8/dist-packages/nvtabular/ops/normalize.py(112): output_dtype
  /usr/local/lib/python3.8/dist-packages/nvtabular/ops/normalize.py(85): transform
  /usr/local/lib/python3.8/dist-packages/nvtx/nvtx.py(101): inner
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(192): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(134): _transform_tensors
  /usr/local/lib/python3.8/dist-packages/nvtabular/inference/workflow/base.py(107): run_workflow
  /workspace/TF4Rec/models/t4r_pytorch_nvt/1/model.py(120): execute
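
For reference, here is a minimal, self-contained version of the request the traceback points to. The input column name, dtype, and values are assumptions for illustration only; the notebook assembles inputs from every raw column of the processed dataset.

import numpy as np
import tritonclient.grpc as grpcclient

MODEL_NAME_NVT = "t4r_pytorch"

# Hypothetical single-column payload; the notebook sends all raw session columns.
data = np.array([[1], [2], [3]], dtype=np.int32)
inputs = [grpcclient.InferInput("item_id", list(data.shape), "INT32")]
inputs[0].set_data_from_numpy(data)

with grpcclient.InferenceServerClient("localhost:8001") as client:
    # Fails in the t4r_pytorch_nvt step of the ensemble with:
    # AttributeError: 'Normalize' object has no attribute 'out_dtype'
    response = client.infer(MODEL_NAME_NVT, inputs)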

Steps/Code to reproduce bug

  • Pull the PyTorch inference container
  • Pull the latest main of Transformers4Rec and NVTabular
  • Run the notebook 01-ETL-with-NVTabular.ipynb to generate the data (a rough sketch of the preprocessing workflow it fits follows this list)
  • Run the notebook 02-End-to-end-session-based-with-Yoochoose-PyT.ipynb
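
As a rough sketch (column names and paths are assumptions, not taken from the notebook), the ETL step fits and saves a workflow along these lines, including the Normalize op that later raises the error:

import nvtabular as nvt
from nvtabular.ops import Categorify, Normalize

# Categorify the id-like columns and normalize a continuous column; the
# Normalize op is the one that later fails at serving time.
cat_features = ["item_id", "category"] >> Categorify()
cont_features = ["price"] >> Normalize()

workflow = nvt.Workflow(cat_features + cont_features)
workflow.fit(nvt.Dataset("./data/*.parquet"))  # assumed path to the raw interaction data
workflow.save("./workflow_etl")                # artifact later wrapped into the Triton ensemble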

Expected behavior

  • Serve the NVTabular workflow without error
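
If the missing attribute comes from a workflow that was fitted and saved with a different NVTabular version than the one in the serving container (an assumption; the thread does not confirm the cause), one possible mitigation is to re-fit and re-save the workflow inside the inference container so the serialized ops carry the attributes the serving code expects. Paths below are placeholders.

import nvtabular as nvt

# Load the workflow saved by the ETL step (assumed path), re-fit it with the
# NVTabular version installed in the serving container, and overwrite the saved
# artifact so ops like Normalize are serialized with the expected attributes.
workflow = nvt.Workflow.load("./workflow_etl")
workflow.fit(nvt.Dataset("./data/*.parquet"))
workflow.save("./workflow_etl")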

sararb avatar Jul 15 '22 15:07 sararb

@sararb, please triage this bug.

viswa-nvidia avatar Jul 15 '22 18:07 viswa-nvidia

I wasn't able to replicate this one; I'm not sure whether it's been fixed or whether it was due to environment issues of some kind.

karlhigley avatar Jul 20 '22 20:07 karlhigley

@sararb, should we continue to track this bug? It looks like Karl couldn't replicate it.

viswa-nvidia avatar Aug 11 '22 00:08 viswa-nvidia

I think this has been addressed by recent-ish changes in NVT

karlhigley avatar Aug 11 '22 01:08 karlhigley