systems
[BUG] QueryFaiss returning item ID as -1 causing type error
Bug description
When the Faiss index cannot fill all TopK results, it returns an item_id of -1 for the missing entries. Feast then returns None for those items, which causes a type error downstream. QueryFaiss does not filter out the -1 values before passing its output on.
InferenceServerException: [StatusCode.INTERNAL] Traceback (most recent call last):
File "/workspace/examples/Building-and-deploying-multi-stage-RecSys/poc_ensemble/executor_model/1/model.py", line 101, in execute
outputs = self.ensemble.transform(inputs, runtime=TritonExecutorRuntime())
File "/usr/local/lib/python3.10/dist-packages/merlin/systems/dag/ensemble.py", line 78, in transform
return runtime.transform(self.graph, transformable)
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/runtime.py", line 53, in transform
return self.executor.transform(transformable, [graph.output_node])
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 102, in transform
transformed_data = self._execute_node(node, transformable, capture_dtypes, strict)
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 116, in _execute_node
upstream_outputs = self._run_upstream_transforms(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 130, in _run_upstream_transforms
node_output = self._execute_node(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 116, in _execute_node
upstream_outputs = self._run_upstream_transforms(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 130, in _run_upstream_transforms
node_output = self._execute_node(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 116, in _execute_node
upstream_outputs = self._run_upstream_transforms(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 130, in _run_upstream_transforms
node_output = self._execute_node(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 116, in _execute_node
upstream_outputs = self._run_upstream_transforms(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 130, in _run_upstream_transforms
node_output = self._execute_node(
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 122, in _execute_node
transform_output = self._run_node_transform(node, transform_input, capture_dtypes, strict)
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 250, in _run_node_transform
raise exc
File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 237, in _run_node_transform
transformed_data = node.op.transform(selection, input_data)
File "/usr/local/lib/python3.10/dist-packages/merlin/systems/dag/ops/feast.py", line 241, in transform
feature_array = array_constructor(feature_value).astype(
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
Related to #145
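The final TypeError in the traceback can be reproduced in isolation. This is a minimal sketch, assuming the Feast lookup hands back a list that contains None for the unknown item ID; the list contents and the helper name `to_int_array` are hypothetical and only mimic the `array_constructor(feature_value).astype(...)` call shown above.

```python
import numpy as np

# Hypothetical lookup result: the middle entry is None because the
# requested item ID (-1) has no matching row in the feature store.
feature_value = [3, None, 7]


def to_int_array(values):
    """Build an array from the values, then cast it to an integer dtype."""
    # The None forces an object-dtype array; the cast then calls int(None).
    return np.array(values).astype(np.int32)


try:
    to_int_array(feature_value)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Any None in the looked-up values is enough to trigger the failure, which is why the -1 IDs need to be removed before the lookup.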
Steps/Code to reproduce bug
Run the Building-and-deploying-multi-stage-RecSys example.
Expected behavior
QueryFaiss should filter the -1 item IDs out of the array. https://github.com/NVIDIA-Merlin/systems/blob/92340ba89f4d7984c21ac03453cabaae5142918f/merlin/systems/dag/ops/faiss.py#L110
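The requested filtering could look roughly like the following. This is a sketch under stated assumptions, not the actual QueryFaiss code: the function name `drop_padding_ids`, the sample IDs, and the placeholder scores are all hypothetical, and it handles a single query row only.

```python
import numpy as np


def drop_padding_ids(ids, scores, padding_id=-1):
    """Remove entries whose ID equals the padding value (hypothetical helper).

    Assumes ids and scores have the same shape and describe one query, so
    flattening the result to 1-D loses no information.
    """
    ids = np.asarray(ids)
    scores = np.asarray(scores)
    keep = ids != padding_id  # boolean mask, True for real item IDs
    return ids[keep], scores[keep]


# Simulated search output for one query with topk=4 where only two
# neighbours were found; the remaining slots are padded with -1.
ids = np.array([[12, 5, -1, -1]])
scores = np.array([[0.91, 0.40, 0.0, 0.0]], dtype=np.float32)

filtered_ids, filtered_scores = drop_padding_ids(ids, scores)
print(filtered_ids)
```

Filtering the scores with the same mask keeps IDs and scores aligned, which matters if a later stage consumes both.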
Environment details
- Merlin version:
- Platform:
- Python version:
- PyTorch version (GPU?):
- Tensorflow version (GPU?):
Additional context
I am running the nightly merlin-tensorflow container.
This also seems related to #207 and https://github.com/NVIDIA-Merlin/Merlin/issues/485
As a temporary workaround, I added >> LambdaOp(lambda col: col.values[col.values>=0]) after QueryFaiss, which drops the negative item IDs.
For example:
retrieval = (
    user_features
    >> PredictTensorflow(retrieval_model_path)
    >> QueryFaiss(faiss_index_path, topk=topk_retrieval)
    >> LambdaOp(lambda col: col.values[col.values >= 0])
)