wangchd

3 comments by wangchd

> Hi @wangchengdng,
>
> In your first example, did you try explicitly setting `batch_size: 0` for the CUDA graph as described [here](https://github.com/triton-inference-server/common/blob/main/protobuf/model_config.proto#L770-L776)? Zero might be the default...
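
For context, a minimal sketch of what that setting looks like in a config.pbtxt (the model name, platform, and tensor names/shapes below are placeholders, not taken from the original issue):

```
name: "my_model"            # placeholder
platform: "tensorrt_plan"   # placeholder; Triton's cuda graphs option is documented for the TensorRT backend
max_batch_size: 0
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 1, 16 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 1, 16 ]
  }
]
optimization {
  cuda {
    graphs: true
    graph_spec {
      # Per the linked model_config.proto comment: when max_batch_size is 0,
      # the graph spec's batch_size must also be 0.
      batch_size: 0
      input {
        key: "INPUT0"
        value { dim: [ 1, 16 ] }
      }
    }
  }
}
```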

> Is Triton able to load the model when not using CUDA graphs and other optimizations? Can you try providing no config.pbtxt for the model and share what config Triton generates/autocompletes...
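
For anyone following the same suggestion: the config Triton generates/autocompletes can be read back from a running server over its HTTP endpoint (assuming the default port 8000 and a model named `my_model`; on older Triton releases, autocomplete may need `--strict-model-config=false` at server start):

```
curl localhost:8000/v2/models/my_model/config
```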

Hello, we have similar needs. Has this been solved yet?