Hannes Fassold

Results 7 comments of Hannes Fassold

No, the paper doesn't mention retraining, as far as I could understand it. Also, I cannot find any training loop in the code. The main function for generating the quantized model seems to...

@WesCook Thanks, SpQR also looks interesting, although AWQ seems to be the 'easier' format to understand and implement, judging from a first look at both papers.

Use the bitsandbytes 4-bit quantization instead.
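For illustration, a minimal sketch of what loading a model with bitsandbytes 4-bit weights might look like through the Hugging Face `transformers` integration. The comment does not say which model or loader is meant, so the function name `load_in_4bit`, the helper `approx_weight_bytes`, and the NF4 / bfloat16 / double-quantization settings are assumptions chosen for the example, not the setup from the original thread. The helper only gives a rough size estimate and ignores the extra storage for quantization constants.

```python
def approx_weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Rough weight-storage estimate; ignores quantization constants and activations."""
    return n_params * bits_per_weight // 8


def load_in_4bit(model_id: str):
    """Load a causal LM with bitsandbytes 4-bit weights (hypothetical helper).

    Assumes torch, transformers, accelerate and bitsandbytes are installed
    and that a supported GPU is available.
    """
    # Imported lazily so this file can be imported without the GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # store weights in 4 bits
        bnb_4bit_quant_type="nf4",              # assumption: NF4 rather than "fp4"
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 for the matmuls
        bnb_4bit_use_double_quant=True,         # assumption: also quantize the constants
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )


if __name__ == "__main__":
    # A 7B-parameter model: about 3.5 GB at 4 bits vs. 28 GB at 32 bits.
    print(approx_weight_bytes(7_000_000_000, 4))
    print(approx_weight_bytes(7_000_000_000, 32))
```

Calling `load_in_4bit` with a model id from the Hugging Face Hub would download and quantize the weights at load time; no retraining step is involved in this path.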

See https://github.com/open-mmlab/mmcv/issues/3302#issuecomment-3214056533

You can build the flash-attn package from source.

Ovis2.5 runs fine with recent Transformers versions. It should NOT be included (like Ovis-2) in the 'official' transformers package (see https://github.com/AIDC-AI/Ovis/issues/124).