Contradictory information in documentation about the ability to push quantized models to the Hub
System Info
Using Google Colab and the main branch of the transformers library on GitHub.
Who can help?
@sgugger @stevhliu @MKhalusova
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
The notes at the end of the sections Load a large model in 4bit and Load a large model in 8bit suggest that it's not possible to push the quantized weights to the Hub:
Note that once a model has been loaded in 4-bit it is currently not possible to push the quantized weights on the Hub.
Note that once a model has been loaded in 8-bit it is currently not possible to push the quantized weights on the Hub except if you use the latest transformers and bitsandbytes.
But the example in Push quantized models on the 🤗 Hub suggests that it is possible to push quantized models to the Hub. The same is suggested in Load a quantized model from the 🤗 Hub.
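For context, the example in that section looks roughly like the following (the model and repo names here are just placeholders for illustration):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in 8-bit via bitsandbytes ("bigscience/bloom-560m" is only
# used for illustration; any causal LM would work the same way).
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",
    device_map="auto",
    load_in_8bit=True,
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")

# Push the quantized weights to the Hub ("bloom-560m-8bit" is a placeholder repo name).
model.push_to_hub("bloom-560m-8bit")
tokenizer.push_to_hub("bloom-560m-8bit")
```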
Does this mean that pushing to the Hub is only supported for 8-bit quantized models when using the latest transformers and bitsandbytes, but NOT for 4-bit models?
Or is it actually possible to push both 8-bit and 4-bit quantized models to the Hub?
Expected behavior
Can 4-bit and 8-bit quantized models be pushed to the Hub and loaded from the Hub?
cc @younesbelkada
Hi @amdnsr, thanks for the issue. As explained in the mentioned paragraphs, it is possible to push 8-bit quantized weights only if you use the latest transformers + bitsandbytes. However, pushing 4-bit weights is currently not supported.
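In other words, something along these lines should reflect the current behavior (model and repo names below are placeholders, and this assumes an up-to-date transformers + bitsandbytes install):

```python
from transformers import AutoModelForCausalLM

# Loading an already-quantized 8-bit checkpoint back from the Hub works
# ("my-username/bloom-560m-8bit" is a placeholder repo name):
model_8bit = AutoModelForCausalLM.from_pretrained(
    "my-username/bloom-560m-8bit",
    device_map="auto",
)

# Loading in 4-bit works, but pushing the 4-bit weights does not:
model_4bit = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",
    device_map="auto",
    load_in_4bit=True,
)
# model_4bit.push_to_hub("bloom-560m-4bit")  # currently not supported
```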
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.