stan-kirdey
+1 on the request
You can look at AWS's Deep Java Library (DJL) Serving: https://github.com/deepjavalibrary/djl-serving. It uses Netty (Java) to dispatch requests to inference, and it can be configured to batch requests dynamically based on a...
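For anyone unfamiliar with dynamic batching, here is a rough Python sketch of the general idea. It is not DJL Serving's code or configuration: the function name `collect_batch` and the parameters `max_batch_size` and `max_delay_s` are made up for illustration, and the assumption that a batch closes on either a size limit or a time limit is mine, since the comment above is cut off before it says what the batching is based on.

```python
import queue
import time


def collect_batch(requests, max_batch_size, max_delay_s):
    """Wait for one request, then gather more until the batch is full
    or the delay budget runs out (hypothetical helper, not a DJL API)."""
    batch = [requests.get()]  # block until at least one request arrives
    deadline = time.monotonic() + max_delay_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # delay budget spent: send what we have
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # nothing else arrived in time
    return batch


# Five requests are already waiting; batches are capped at three.
pending = queue.Queue()
for request_id in range(5):
    pending.put(request_id)

first = collect_batch(pending, max_batch_size=3, max_delay_s=0.05)
second = collect_batch(pending, max_batch_size=3, max_delay_s=0.05)
print(first, second)
```

The first batch fills up immediately, while the second is sent partially full once the delay expires. That trade-off (larger batches versus added latency) is what the size and delay settings would tune in a real server.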