stan-kirdey

2 comments by stan-kirdey

+1 on the request

You can look at AWS's Deep Java Library (DJL) Serving (https://github.com/deepjavalibrary/djl-serving). It uses Netty/Java to dispatch requests to inference, and it can be configured to batch requests dynamically based on a...
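For anyone unfamiliar with the idea, here is a rough sketch of what dynamic batching generally looks like. It is not DJL Serving's code or API. The comment above is cut off, so the criteria used here (a maximum batch size and a maximum wait time) are my assumption, and `collect_batch`, `run_inference`, `max_batch_size`, and `max_delay_s` are hypothetical names chosen for illustration.

```python
import queue
import time


def collect_batch(requests, max_batch_size=4, max_delay_s=0.05):
    """Gather requests into one batch.

    Assumed policy (not taken from DJL Serving): block until the first
    request arrives, then keep collecting until the batch is full or
    max_delay_s has elapsed since that first request.
    """
    batch = [requests.get()]  # wait for at least one request
    deadline = time.monotonic() + max_delay_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # waited long enough, send a partial batch
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived before the deadline
    return batch


def run_inference(batch):
    # Placeholder for a real model call: one pass over the whole batch.
    return [f"result for {item}" for item in batch]


if __name__ == "__main__":
    pending = queue.Queue()
    for i in range(6):
        pending.put(f"request-{i}")
    while not pending.empty():
        print(run_inference(collect_batch(pending)))
```

The trade-off is the usual one: a larger batch size or longer delay improves throughput per model call, while a shorter delay keeps latency low when traffic is light.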