[Task] Set up integration tests for Merlin example notebooks
Problem:
- Currently, we have unit tests that cover a partial dataset for the example notebooks. We cannot guarantee that the example notebooks will be functional for the full dataset.
Goal:
- Integration tests that confirm the example notebooks are fully functional against the full dataset.
Constraints:
- Previously, framework support for TensorFlow (TF), PyTorch, and HugeCTR was available only in NVTabular. With the recent architectural changes, this support must be made available in Merlin Systems. Currently, only TF support is available.
Starting Point:
- [ ] Create support for HugeCTR in Merlin Systems for inference
- [ ] Create support for PyTorch in Merlin Systems for inference
- [ ] Refactor the integration tests in NVTabular to remove all non-feature-processing steps
- [ ] Create an e2e example notebook for HugeCTR in the Merlin repo
- [ ] Create an e2e example notebook for PyTorch in the Merlin repo
- [ ] Create CI for the HugeCTR e2e notebook example in the Merlin repo
- [ ] Create CI for the PyTorch e2e notebook example in the Merlin repo
- [ ] Create jobs to run integration tests on different GPU architectures (?)
- [ ] Create jobs to run integration tests on different multi-GPU configurations (?)
- [ ] Create a new system for reporting metrics to ASVDB
- [ ] Refactor the integration tests to report captured metrics
- [ ] Set up a system to display ASVDB metrics (internally)
- [ ] Create DLs for ASVDB notifications - per repo (PIC, SIC, ??)
- [ ] Hook the appropriate DLs into the monitored metrics
All
- [ ] Create boilerplate for notebooks running integration tests using testbook
- [ ] https://github.com/NVIDIA-Merlin/Merlin/issues/285
- [ ] Add nightly integration tests that use real data
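As a rough illustration of the notebook-testing boilerplate proposed above, the sketch below runs an example notebook end to end by shelling out to `jupyter nbconvert --execute`, so any cell that raises fails the test. The notebook path is a hypothetical placeholder; testbook would additionally allow assertions on objects defined inside the notebook.

```python
# Minimal sketch: execute an example notebook as an integration test.
# The notebook path below is a hypothetical placeholder.
import subprocess
import sys


def nbconvert_command(notebook_path: str, timeout_s: int = 3600) -> list[str]:
    """Build the command that executes a notebook and fails on any cell error."""
    return [
        sys.executable, "-m", "jupyter", "nbconvert",
        "--to", "notebook", "--execute",
        f"--ExecutePreprocessor.timeout={timeout_s}",
        "--output", "executed.ipynb",
        notebook_path,
    ]


def run_notebook(notebook_path: str) -> None:
    # check=True raises CalledProcessError if any notebook cell raises,
    # which is the pass/fail signal an integration test needs.
    subprocess.run(nbconvert_command(notebook_path), check=True)
```

A test would then simply call `run_notebook("examples/getting-started.ipynb")` (hypothetical path) from pytest; the full dataset requirement is satisfied by pointing the notebook's data path at the real data in the nightly job.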
Merlin Models
- [ ] #212
- [ ] https://github.com/NVIDIA-Merlin/models/issues/482
Merlin Systems
- [ ] #213
Merlin
- [x] #214
@karlhigley @bschifferer, please add details such as the problem, goal, and constraints, and let me know whether this ticket captures everything from a task perspective.
We need input on who will be assigned to the tasks in this ticket.
@viswa-nvidia, please follow up with Julio and check off the completed items. Review during the CI sync.
I do not understand this Roadmap ticket. I think the ticket should be to set up and clean up integration tests for EXISTING Merlin example notebooks. The ticket contains too many requests/examples that are standalone features, not only integration tests.
I think we need to create the following Roadmap tickets:
Inference Support for HugeCTR in Merlin Systems:
- Create support for HugeCTR in Merlin Systems for inference -> This is a feature of Merlin Systems to support HugeCTR
- Create an e2e example notebook for HugeCTR in the Merlin repo -> This adds an example for the new feature
- Create CI for the HugeCTR e2e notebook example in the Merlin repo -> This adds the integration test for the new example
Only the last point is an integration-test task. This should be its own ticket with the three bullets. Otherwise, we would have one Roadmap ticket, "Set up unit tests for Merlin", that contains developing the feature, the examples, the unit tests, etc.
Similarly:
Support for PyTorch Inference in Merlin Systems:
- Create support for PyTorch in Merlin Systems for inference -> This is a feature of Merlin Systems to support PyTorch
- Create an e2e example notebook for PyTorch in the Merlin repo -> This adds the example for the new feature. Is the e2e example using Merlin Models or native PyTorch? If we need Merlin Models PyTorch support first, we need to extend the ticket.
- Create CI for the PyTorch e2e notebook example in the Merlin repo -> This adds the integration test for the new example