Use GPU tensors in Triton ensemble operators
- [ ] Use DLPack to share GPU memory between Triton models in an ensemble
- [ ] Upgrade NumPy in the containers and check whether DLPack works with Triton tensors
- [ ] Try to build a repro of transferring CuPy tensors to Triton via DLPack (re: issue with contiguous arrays)
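
For the repro in the last item, it may help to first check whether the array being exported is compact and row-major. The following is a minimal sketch, assuming the issue comes down to a shape/stride mismatch (a guess, not something these notes confirm). `is_c_contiguous` is a hypothetical helper, not part of CuPy, NumPy, or Triton. It takes strides counted in elements, as DLPack does, rather than in bytes as NumPy and CuPy report them.

```python
from typing import Optional, Sequence


def is_c_contiguous(shape: Sequence[int], strides: Optional[Sequence[int]]) -> bool:
    """Return True if (shape, strides) describe a compact row-major tensor.

    Hypothetical helper for the repro. Strides are counted in elements
    (the DLPack convention), and None is treated as compact row-major.
    """
    if strides is None:
        return True
    if len(shape) != len(strides):
        raise ValueError("shape and strides must have the same length")
    if any(dim == 0 for dim in shape):
        return True  # an empty tensor has no elements to misplace
    expected = 1
    # Walk from the innermost axis outwards, tracking the stride a
    # compact row-major layout would have at each axis.
    for dim, stride in zip(reversed(shape), reversed(strides)):
        # Axes of length 1 are never stepped over, so their stride is irrelevant.
        if dim != 1 and stride != expected:
            return False
        expected *= dim
    return True


# A 2x3 row-major array has element strides (3, 1).
print(is_c_contiguous((2, 3), (3, 1)))
# Its transpose has shape (3, 2) and strides (1, 3): same memory, different layout.
print(is_c_contiguous((3, 2), (1, 3)))
```

With a real CuPy array, the element strides can be obtained by dividing each entry of `arr.strides` by `arr.itemsize`. If the check fails, copying the array with `cupy.ascontiguousarray` before exporting it is one thing to try in the repro.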