Hannes Fassold
No, the paper doesn't mention retraining, as far as I could understand it. Also, I cannot find any training loop in the code. The main function for generating the quantized model seems to...
@WesCook Thanks, SpQR also looks interesting, although AWQ seems to be the 'easier' format (to understand and implement), just from a first look at both papers.
Use the bitsandbytes 4-bit quantization instead.
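In case it helps, here is a rough sketch of how 4-bit loading via bitsandbytes usually looks through the Hugging Face transformers integration. It assumes `transformers`, `torch` and `bitsandbytes` are installed and a CUDA GPU is available. The helper names `nf4_config_kwargs` and `load_4bit`, the `model_id` parameter, and the NF4 / double-quant / bfloat16 settings are my own illustrative choices, not something from this thread.

```python
def nf4_config_kwargs():
    """Keyword arguments for transformers.BitsAndBytesConfig (illustrative choices)."""
    return {
        "load_in_4bit": True,                  # store weights in 4-bit
        "bnb_4bit_quant_type": "nf4",          # NormalFloat4; "fp4" is the other option
        "bnb_4bit_use_double_quant": True,     # also quantize the quantization constants
        "bnb_4bit_compute_dtype": "bfloat16",  # dtype used for the matmuls
    }


def load_4bit(model_id):
    """Load a causal LM in 4-bit. `model_id` is a placeholder for your checkpoint."""
    # Imported lazily so the config helper above works without the heavy packages.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**nf4_config_kwargs())
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the layers
    )
```

No calibration data or retraining is needed with this route; the weights are quantized while the checkpoint is loaded.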
See https://github.com/open-mmlab/mmcv/issues/3302#issuecomment-3214056533
Thank you @gboeer !!!
You can build the flash-attn package from source.
Ovis2.5 runs fine with recent Transformers versions. It should NOT be included (like Ovis-2) in the 'official' transformers package (see https://github.com/AIDC-AI/Ovis/issues/124 ).