Model chaining example

Open SlipknotTN opened this issue 5 years ago • 5 comments

The README mentions the possibility of model chaining ("Model Chaining: Model A -> Glue -> Model B -> etc."), but I didn't find an example in the repository.

Is there an example available? Any hints on how to do that?

I'd like to group multiple models into a single client call to save transfer time.

SlipknotTN avatar Mar 14 '19 09:03 SlipknotTN

I’ll whip up an example.

Help me understand your use case a bit more and I'll see if I can put together an example that helps you get to where you want to go.

ryanolson avatar Mar 14 '19 19:03 ryanolson

Thank you, my use case is like this:

Client sends an image -> Model A (TensorRT) on the server -> Model B (TensorRT) on the server -> custom C++ code on the server -> results back to the client.

The intermediate results are large, so I'd like to keep the processing on the server end-to-end.
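
Roughly, I'm picturing a single server-side handler like the hypothetical sketch below (InferA, InferB, and PostProcess are made-up placeholder names, just to show that only the final result crosses the wire):

```cpp
// Hypothetical server-side handler, only to illustrate the flow I'm after.
// InferA / InferB / PostProcess are placeholders, not real APIs.
#include <cstdint>
#include <vector>

std::vector<float> InferA(const std::vector<uint8_t>& image);      // Model A (TensorRT)
std::vector<float> InferB(const std::vector<float>& features);     // Model B (TensorRT)
std::vector<float> PostProcess(const std::vector<float>& logits);  // custom C++ step

std::vector<float> HandleRequest(const std::vector<uint8_t>& image)
{
    auto a = InferA(image);   // large intermediate, never leaves the server
    auto b = InferB(a);       // large intermediate, never leaves the server
    return PostProcess(b);    // only this small result goes back to the client
}
```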

SlipknotTN avatar Mar 15 '19 09:03 SlipknotTN

The outputs of Model A are the inputs for Model B?

How about this for an example:

  • Decompose ResNet-152 into two TensorRT engines
    • Model A = base model which consists of the first 100-ish layers
    • Model B = customization model which consists of the remaining layers
    • Presumably you could have many customized models that all leverage the same base model.
    • The inference request will specify: base_model, customized_model
    • We will use the buffer-reuse options of the CyclicAllocator and the ExecutionContext to minimize the memory footprint of the transaction (a rough sketch of the chained execution follows the list)
  • Provide the custom C++ post-processing lambda
    • Assume that the post-processing is heavy, so we'll provide some dedicated threads for "extra post-processing" outside the typical request lifecycle.
    • We'll add a random 1-2 ms of "post-processing"
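
Here is a rough sketch of the chained execution plus the dedicated post-processing step. It uses the plain TensorRT C++ API (explicit-batch engines and `enqueueV2`) rather than this library's CyclicAllocator/ExecutionContext wrappers, and it assumes Model A has a single output whose shape and dtype match Model B's single input; the buffer names, sizes, and thread handling are illustrative only:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>

#include <chrono>
#include <cstddef>
#include <future>
#include <random>
#include <thread>
#include <vector>

// Run Model A, feed its output to Model B without leaving the GPU, then hand the
// host copy of the final tensor to a "post-processing" lambda on its own thread.
std::future<std::vector<float>> ChainAndPostProcess(
    nvinfer1::ICudaEngine& engineA, nvinfer1::ICudaEngine& engineB,
    const std::vector<float>& hostInput,
    std::size_t inputBytes, std::size_t intermediateBytes, std::size_t outputBytes)
{
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    void* dInput = nullptr;
    void* dIntermediate = nullptr;  // Model A's output == Model B's input
    void* dOutput = nullptr;
    cudaMalloc(&dInput, inputBytes);
    cudaMalloc(&dIntermediate, intermediateBytes);
    cudaMalloc(&dOutput, outputBytes);

    cudaMemcpyAsync(dInput, hostInput.data(), inputBytes,
                    cudaMemcpyHostToDevice, stream);

    auto* ctxA = engineA.createExecutionContext();
    auto* ctxB = engineB.createExecutionContext();

    // Bindings are assumed to be ordered {input, output}; a real server would
    // look the indices up by tensor name with getBindingIndex().
    void* bindingsA[] = {dInput, dIntermediate};
    void* bindingsB[] = {dIntermediate, dOutput};

    ctxA->enqueueV2(bindingsA, stream, nullptr);  // Model A
    ctxB->enqueueV2(bindingsB, stream, nullptr);  // Model B reads A's output in place

    std::vector<float> hostOutput(outputBytes / sizeof(float));
    cudaMemcpyAsync(hostOutput.data(), dOutput, outputBytes,
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    ctxA->destroy();
    ctxB->destroy();
    cudaFree(dInput);
    cudaFree(dIntermediate);
    cudaFree(dOutput);
    cudaStreamDestroy(stream);

    // "Extra post-processing" outside the GPU lifecycle, simulated here by a
    // random 1-2 ms sleep on a dedicated thread.
    return std::async(std::launch::async, [out = std::move(hostOutput)]() mutable {
        thread_local std::mt19937 rng{std::random_device{}()};
        std::uniform_int_distribution<int> delayUs(1000, 2000);
        std::this_thread::sleep_for(std::chrono::microseconds(delayUs(rng)));
        return std::move(out);
    });
}
```

The point of the shared `dIntermediate` buffer is that the Model A -> Model B hand-off never leaves the device, so chaining costs a second enqueue on the same stream rather than an extra host round trip; the real example would get those buffers from the CyclicAllocator instead of raw `cudaMalloc`.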

ryanolson avatar Mar 16 '19 11:03 ryanolson

Sorry for the question, but I saw that TensorRT Inference Server allows chaining more than one model (the feature is actually still in development) and that it is possible to add custom C++ code as a custom backend model. What is the relation between this project and TensorRT Inference Server? Is this a lower-level version of TRTIS?

SlipknotTN avatar Mar 28 '19 16:03 SlipknotTN

Good question. NvRPC in TRTIS originated from this project. I hope someday the team pulls in the tensorrt runtime.

ryanolson avatar Apr 02 '19 01:04 ryanolson