What are the options for concurrent AI model inference in DJL?

Open SidneyLann opened this issue 2 years ago • 1 comments

Description

What are the options for concurrent AI model inference in DJL? Can multiple threads access the same model at the same time? Is Nvidia Triton supported?

SidneyLann, Nov 03 '23 21:11

DJL is a low-level library. For serving, we have DJLServing, a model server designed as a general inference platform. We also support running tritoncore inside DJLServing. Please take a look: https://docs.djl.ai/master/docs/serving/index.html

frankfliu, Nov 05 '23 18:11
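
On the multithreading part of the question, which the reply above does not address directly: as far as I understand the DJL documentation, the usual pattern is to load the model once, share it across threads, and give each thread its own Predictor, since a Predictor is not meant to be shared. The sketch below illustrates only that pattern using the Java standard library. ToyModel, ToyPredictor, and runConcurrently are hypothetical stand-ins, not DJL classes; in real code they would roughly correspond to a loaded ZooModel and the Predictor returned by its newPredictor() method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedModelDemo {

    /** Hypothetical stand-in for a loaded model: read-only after construction, so safe to share. */
    static final class ToyModel {
        private final float weight;

        ToyModel(float weight) {
            this.weight = weight;
        }

        ToyPredictor newPredictor() {
            return new ToyPredictor(this);
        }
    }

    /** Hypothetical stand-in for a predictor: holds mutable scratch state, so one per task. */
    static final class ToyPredictor implements AutoCloseable {
        private final ToyModel model;
        private final float[] scratch = new float[1]; // mutable buffer, not safe to share

        ToyPredictor(ToyModel model) {
            this.model = model;
        }

        float predict(float input) {
            scratch[0] = input * model.weight;
            return scratch[0];
        }

        @Override
        public void close() {
            // A real predictor would release native resources here.
        }
    }

    /** Runs one inference task per input on a fixed thread pool, sharing a single model. */
    static List<Float> runConcurrently(ToyModel model, float[] inputs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Float>> futures = new ArrayList<>();
            for (float input : inputs) {
                futures.add(pool.submit(() -> {
                    // Each task creates and closes its own predictor; only the model is shared.
                    try (ToyPredictor predictor = model.newPredictor()) {
                        return predictor.predict(input);
                    }
                }));
            }
            List<Float> results = new ArrayList<>();
            for (Future<Float> future : futures) {
                results.add(future.get()); // results come back in submission order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        ToyModel model = new ToyModel(2.0f);
        System.out.println(runConcurrently(model, new float[] {1f, 2f, 3f}, 3)); // prints [2.0, 4.0, 6.0]
    }
}
```

Creating a predictor per task keeps the sketch simple; a long-running service might instead keep one predictor per worker thread, or hand the whole problem to a model server such as DJLServing as suggested above.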