systems icon indicating copy to clipboard operation
systems copied to clipboard

Enable cpp code path for `Categorify` ops

Open rjzamora opened this issue 1 year ago • 4 comments

Rolls back one of the changes in https://github.com/NVIDIA-Merlin/systems/pull/354 to improve inference performance of Categorify.

rjzamora avatar Apr 10 '24 18:04 rjzamora

the docs related errors are now resolved.

There's two remaining things for this MR:

  • fix the lint errors (reformat with black pre-commit run --all-files black) and the codespell error (typo in everything)
  • merge the NVTabular PR https://github.com/NVIDIA-Merlin/NVTabular/pull/1874
    • currently blocked by some unrelated errors with docs (which will be resolved by https://github.com/NVIDIA-Merlin/NVTabular/pull/1878 )

oliverholworthy avatar Apr 25 '24 14:04 oliverholworthy

There's two remaining things for this MR: ...

Thanks so much @oliverholworthy ! Sorry, I missed this earlier.

rjzamora avatar Apr 26 '24 22:04 rjzamora

Following the merge of https://github.com/NVIDIA-Merlin/NVTabular/pull/1874 we're now getting a new error in the tests

RuntimeError: Error: <class 'ValueError'> - Unknown column for CategorifyTransform x__values

oliverholworthy avatar Apr 30 '24 08:04 oliverholworthy

I haven't had the time to understand the test failure well enough to work through a long-term fix. However, my last commit roles back the decision to use the cpp code path for "Categorify" by default. Instead, when the WorkflowRunner is initialized, the value of the NVT_CPP_OPS environment variable will dictate which NVT ops the cpp code path will be used for.

For example, you can enable the cpp version of Categorify by deploying a triton model as follows:

NVT_CPP_OPS="Categorify" tritonserver --model-repository=/my-model/

rjzamora avatar May 31 '24 23:05 rjzamora

This is failing right now on some package mismatches right now. A solution is to update the tritonclient package to 2.30.0. Working on that solution now. Then we should be able to pick up the newest version of CI container to fix the triton errors.

jperez999 avatar Jun 07 '24 17:06 jperez999