wangchd
> Hi @wangchengdng,
>
> In your first example, did you try explicitly setting `batch_size: 0` for the cuda graph as described [here](https://github.com/triton-inference-server/common/blob/main/protobuf/model_config.proto#L770-L776)? Zero might be the default...
>
> Is Triton able to load the model when not using cuda graph and other optimizations? Can you try providing no config.pbtxt for the model and share what config Triton generates/autocompletes...
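For anyone else landing here, this is a minimal sketch of what explicitly setting `batch_size: 0` in the CUDA graph spec could look like; the model name, platform, and input tensor name/shape below are placeholder assumptions, not taken from this thread:

```
# Hypothetical config.pbtxt sketch; model/tensor names and shapes are placeholders.
name: "my_model"
platform: "tensorrt_plan"
max_batch_size: 0  # model served without batching

optimization {
  cuda {
    graphs: true
    graph_spec {
      # Per model_config.proto, when max_batch_size is 0 the
      # CUDA graph batch_size must also be set to 0.
      batch_size: 0
      input {
        key: "INPUT__0"
        value { dim: [ 1, 3, 224, 224 ] }
      }
    }
  }
}
```

If I recall correctly, starting tritonserver with `--strict-model-config=false` and no config.pbtxt lets Triton autocomplete the config, which you can then inspect at the `/v2/models/<model>/config` endpoint, as suggested above.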
Hello, we have a similar need. Has this been resolved yet?