Abu Qader

Results 11 issues of Abu Qader

**What:** Creating a CLI command for readme generation and including the new `description` from `config.yaml` in the README.

Thanks to this repo, I wanted to share the LoRA adapters for the 30B model. I used the cleaned dataset. Maybe we can add all of the OSS adapters...

## :rocket: What This PR updates the Triton / TRT-LLM template to throw 500s when it encounters an exception. _This only applies in the non-streaming use case_. ## :computer: How We...

This PR adds a high-performance TRT-LLM builder _and_ serving truss.

This PR adds a Truss for profiling the effects of an async Truss Server and the Triton Server. This implementation is not optimized for production...

### Overview This PR adds support for Triton + TRT-LLM engines. We allow users to define a Hugging Face repository for the pre-built engines and tokenizers. We leverage the C++ TRT...

## :rocket: What This PR changes the name we use to set the Hugging Face auth token to the newly introduced `HF_TOKEN` var. ## :computer: How ## :microscope: Testing

## :rocket: What We want to error out locally when trying to build with FP8 on an architecture that is not FP8-compatible. ## :computer: How ## :microscope: Testing