Vert53
Tried checking for errors in "journalctl -f -n10" but nothing there.
This should really be added to the batch inference documentation, as the example there only shows how to run a single image. I was pretty confused until I stumbled on this...
@toretak were you able to get some Python asyncio code working asynchronously with the TorchServe API?
Found a way to do this by canceling the asyncio.Task. Do I assume correctly that token generation would stop and so we will not be charged for unused output tokens?
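For anyone looking for the pattern, here is a minimal sketch of cancelling an asyncio.Task. `stream_tokens` is a hypothetical stand-in for a streaming client, not a real API. Note that this only shows the client side stopping: whether the server also stops generating (and billing) after the client goes away depends on the provider, and this sketch does not answer that.

```python
import asyncio


async def stream_tokens(received):
    # Hypothetical stand-in for a streaming client: one "token" per tick.
    try:
        for i in range(1000):
            await asyncio.sleep(0.01)
            received.append(f"tok{i}")
    except asyncio.CancelledError:
        # Cleanup point: a real client would close its connection here.
        raise


async def run_and_cancel():
    received = []
    task = asyncio.create_task(stream_tokens(received))
    await asyncio.sleep(0.05)  # let a few tokens arrive
    task.cancel()              # request cancellation at the next await
    try:
        await task             # wait until the task has actually stopped
    except asyncio.CancelledError:
        pass
    return task.cancelled(), len(received)


if __name__ == "__main__":
    print(asyncio.run(run_and_cancel()))
```

Awaiting the task after `cancel()` matters: `cancel()` only schedules the cancellation, and the task is not finished until the `CancelledError` has propagated through it.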
Hi @toretak, what I meant is to use asyncio for the requesting, not the serving (handler). I managed to write this async code to test how fast TorchServe worked on...
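A rough sketch of the client-side idea, not the code from the post above: many requests are issued concurrently with `asyncio.gather`, capped by a semaphore, and the whole batch is timed. `fake_predict` is a hypothetical placeholder that just sleeps; in practice it would be replaced by a real HTTP call to the inference endpoint using an async HTTP client.

```python
import asyncio
import time


async def fake_predict(payload):
    # Hypothetical placeholder for one HTTP request to an inference endpoint.
    await asyncio.sleep(0.05)
    return {"input": payload, "ok": True}


async def run_batch(payloads, limit=8):
    # Cap the number of in-flight requests so the server is not flooded.
    sem = asyncio.Semaphore(limit)

    async def one(p):
        async with sem:
            return await fake_predict(p)

    start = time.perf_counter()
    # gather returns results in the same order as the inputs.
    results = await asyncio.gather(*(one(p) for p in payloads))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_batch(list(range(16))))
    print(len(results), f"{elapsed:.2f}s")
```

The `limit` parameter is an assumption added for illustration; raising or lowering it is one way to see how throughput changes with the number of concurrent requests.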