cj-zhang

4 issues by cj-zhang

…ModelBuilder. *Issue #, if available:* *Description of changes:* TRTLLM containers support compilation and quantization at the same time, so the validations that enforce mutual exclusivity between the two must be removed. Compilation using...

- Add a `deployment_config` field on `SagemakerEndpoint` to allow SageMaker model and compute definitions.
- To be used with the SageMaker Python SDK `ModelBuilder` class to enable just-in-time deployments.

Introduce some ContentHandler templates for common models, e.g. the Llama family. These could save users time and also serve as examples of how to write their own content handlers. ``` class...
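As a rough illustration of what such a template might look like, here is a minimal sketch. It is not the code from the issue (which is truncated above). The class name `LlamaContentHandlerSketch` is hypothetical, and the request and response JSON shapes (`inputs`/`parameters` in, a list with `generated_text` out) are assumptions that depend on how the model is actually served. The class is kept free of LangChain imports so it runs on its own; in practice it would presumably subclass LangChain's `LLMContentHandler`, whose `transform_input`/`transform_output` method names it mirrors.

```python
import io
import json
from typing import Any, Dict


class LlamaContentHandlerSketch:
    """Hypothetical content handler template for a Llama-style endpoint."""

    content_type = "application/json"  # MIME type of the request body
    accepts = "application/json"       # MIME type expected in the response

    def transform_input(self, prompt: str, model_kwargs: Dict[str, Any]) -> bytes:
        # Assumed request schema: {"inputs": <prompt>, "parameters": {...}}
        payload = {"inputs": prompt, "parameters": model_kwargs}
        return json.dumps(payload).encode("utf-8")

    def transform_output(self, output: Any) -> str:
        # `output` is a file-like response body exposing read().
        response = json.loads(output.read().decode("utf-8"))
        # Assumed response schema: [{"generated_text": "..."}]
        return response[0]["generated_text"]


def demo() -> str:
    """Round-trip a prompt through the handler using an in-memory body."""
    handler = LlamaContentHandlerSketch()
    handler.transform_input("Hello", {"max_new_tokens": 16})
    fake_body = io.BytesIO(b'[{"generated_text": "Hi there"}]')
    return handler.transform_output(fake_body)
```

A user adapting this to another model would mainly change the two schema assumptions marked in the comments.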

Proposal to introduce the `aioboto3` client for making SageMaker Runtime API calls, since boto3 calls block on network I/O. `SagemakerEndpoint` currently does not have its own implementation of `_acall` so one...
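To make the proposal concrete, the following is a minimal sketch of a non-blocking endpoint call, not the implementation proposed in the issue. The helper `ainvoke_endpoint`, the endpoint name `my-endpoint`, and the JSON payload are hypothetical. The helper takes an already-opened async client so it can be exercised without AWS access; `main` shows how an `aioboto3` session might supply that client.

```python
import asyncio
import json
from typing import Any


async def ainvoke_endpoint(client: Any, endpoint_name: str, payload: bytes) -> str:
    """Call InvokeEndpoint on an async SageMaker Runtime client."""
    response = await client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=payload,
    )
    # With an async client the response body is an async stream,
    # so read() must be awaited rather than called directly.
    raw = await response["Body"].read()
    return raw.decode("utf-8")


async def main() -> None:
    # Third-party dependency, imported lazily so the helper above
    # can be used and tested without it installed.
    import aioboto3

    session = aioboto3.Session()
    async with session.client("sagemaker-runtime") as client:
        payload = json.dumps({"inputs": "Hi"}).encode("utf-8")  # assumed schema
        print(await ainvoke_endpoint(client, "my-endpoint", payload))


# Run with: asyncio.run(main())  (requires aioboto3 and AWS credentials)
```

Passing the client in, rather than creating it inside the helper, also lets a caller reuse one client across many concurrent requests.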