
Add TRT ComposerModel inference wrapper

Open: nik-mosaic opened this issue 2 years ago • 1 comment

**[WIP] Fix Batching**

Adds a wrapper for TRT models, similar to the OpenAI wrappers in this PR.

The goal is to evaluate TRT models with our gauntlet, the same way we evaluate HF/ComposerModels.
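For readers unfamiliar with the pattern, here is a rough sketch of what such a wrapper might look like. It is not the code in this PR: `InferenceWrapperSketch`, `fake_engine`, and the `EngineFn` type are hypothetical names, and the engine is a plain Python callable standing in for a compiled TRT engine. The only idea it illustrates is an adapter that accepts an eval-style batch and feeds it to an engine one sequence at a time, which is one simple way to sidestep batching problems while they are still being fixed.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a compiled TRT engine: it maps one prompt
# (a sequence of token ids) to per-position logits.
EngineFn = Callable[[Sequence[int]], List[List[float]]]


class InferenceWrapperSketch:
    """Hypothetical adapter: batch-level eval_forward over a per-sequence engine."""

    def __init__(self, engine_fn: EngineFn, pad_token_id: int = 0):
        self.engine_fn = engine_fn
        self.pad_token_id = pad_token_id

    def eval_forward(self, batch: Dict[str, List[List[int]]]) -> List[List[List[float]]]:
        outputs = []
        for row in batch["input_ids"]:
            # Drop padding tokens so the engine only sees real tokens.
            tokens = [t for t in row if t != self.pad_token_id]
            # Run the engine on one sequence at a time (no batched call).
            outputs.append(self.engine_fn(tokens))
        return outputs


def fake_engine(tokens: Sequence[int]) -> List[List[float]]:
    # Toy engine: "logits" over a vocabulary of size 2 for each position.
    return [[float(t), 0.0] for t in tokens]


wrapper = InferenceWrapperSketch(fake_engine)
result = wrapper.eval_forward({"input_ids": [[5, 6, 0], [7, 0, 0]]})
```

Looping per sequence trades throughput for simplicity; a real wrapper would presumably batch requests once the engine supports it.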

Results: https://docs.google.com/spreadsheets/d/1jKJki9QnB8TAt0hkNDQv_DhxMhWb10EtIR8worsLUwU/edit#gid=1219414201

nik-mosaic, Aug 24 '23 23:08

What's the current issue with multi-GPU?

ghost, Aug 25 '23 19:08