gpt-fast
[example] Add support for DBRX
Add support for DBRX from Databricks. Initial perf numbers:
| Mode                | 1 GPU | 2 GPU | 4 GPU | 8 GPU  |
|---------------------|-------|-------|-------|--------|
| baseline (bfloat16) | OOM   | OOM   | 59.53 | 100.51 |
| int8                | OOM   | 66.72 | 91.21 | 146.86 |
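
For a quick read of the table, here is a small sketch that computes the int8-to-bfloat16 ratio wherever both configurations ran. The numbers are copied from the table; the `results` dict and the `int8_ratio` helper are hypothetical names used only for illustration, and the ratio is only a speedup if the figures are throughput (higher is better), which is an assumption since the table does not state units.

```python
# Numbers copied from the table above; None marks an OOM run.
# Assumption: values are throughput (higher is better); units are not stated.
results = {
    "bfloat16": {1: None, 2: None, 4: 59.53, 8: 100.51},
    "int8": {1: None, 2: 66.72, 4: 91.21, 8: 146.86},
}


def int8_ratio(results):
    """Return the int8 / bfloat16 ratio for GPU counts where both ran."""
    ratios = {}
    for gpus, base in results["bfloat16"].items():
        quant = results["int8"].get(gpus)
        # Skip any GPU count where either configuration hit OOM.
        if base is not None and quant is not None:
            ratios[gpus] = quant / base
    return ratios


ratios = int8_ratio(results)
for gpus, ratio in sorted(ratios.items()):
    print(f"{gpus} GPU: int8 / bfloat16 = {ratio:.2f}x")
```

Only the 4 GPU and 8 GPU columns yield a ratio, since bfloat16 runs out of memory on 1 and 2 GPUs.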