
Add Support for Loading Models in 4-bit Quantized Versions (Fixes #1798)

Open · 02shanks opened this pull request 1 year ago · 0 comments

Why are these changes needed?

This pull request adds support for loading models in 4-bit quantized form. Since 4-bit weights take roughly a quarter of the memory of 16-bit weights, this makes model loading and storage far more efficient, particularly in resource-constrained environments.
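As a rough illustration of the feature, 4-bit loading in the Hugging Face ecosystem is typically done by passing bitsandbytes options (wrapped in a `BitsAndBytesConfig`) to `from_pretrained`. The sketch below only builds those options; the helper name `make_4bit_load_kwargs` is hypothetical and not part of FastChat's actual API or this PR's implementation.

```python
# Hedged sketch: make_4bit_load_kwargs is an illustrative helper,
# not FastChat's actual API.

def make_4bit_load_kwargs(compute_dtype: str = "float16") -> dict:
    """Build keyword arguments for 4-bit quantized model loading.

    These keys mirror the bitsandbytes-backed options accepted by
    Hugging Face transformers' BitsAndBytesConfig.
    """
    return {
        "load_in_4bit": True,
        "bnb_4bit_compute_dtype": compute_dtype,  # dtype used for matmuls
        "bnb_4bit_quant_type": "nf4",             # NormalFloat4 quantization
        "bnb_4bit_use_double_quant": True,        # also quantize the quant constants
    }

kwargs = make_4bit_load_kwargs()
print(kwargs["load_in_4bit"])  # True

# In practice (requires transformers + bitsandbytes + a GPU), these
# options would be wrapped and passed to from_pretrained, e.g.:
#
#   from transformers import AutoModelForCausalLM, BitsAndBytesConfig
#   config = BitsAndBytesConfig(**kwargs)
#   model = AutoModelForCausalLM.from_pretrained(
#       "lmsys/vicuna-7b-v1.5", quantization_config=config
#   )
```

The `nf4` quantization type and double quantization follow the QLoRA defaults commonly used for 4-bit inference; the actual PR may choose different settings.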

Related issue number (if applicable)

Closes #1798

Checks

  • [x] I've run format.sh to lint the changes in this PR.
  • [x] I've included any doc changes needed.
  • [x] I've made sure the relevant tests are passing (if applicable).

02shanks · Aug 13 '24 17:08