quanto

A PyTorch Quantization Toolkit

Results 61 quanto issues

I get an `AssertionError` when I run the following code. ``` from diffusers import LuminaText2ImgPipeline from optimum.quanto import freeze, qfloat8, quantize pipeline = LuminaText2ImgPipeline.from_pretrained("Alpha-VLLM/Lumina-Next-SFT-diffusers", torch_dtype=torch.float16).to("cuda") quantize(pipeline.transformer, weights=qfloat8) freeze(pipeline.transformer) image = pipeline("ghibli...

Context: https://x.com/marcaruel/status/1818265542442066307 Code ref: https://github.com/huggingface/optimum-quanto/blob/main/optimum/quanto/tensor/qbits/qbits.py#L146 optimum-quanto's QBitsTensor uses tensors for scale and shift, paced at group_size blocks. It's great but there are two issues: - The alpha and bias (scale...

enhancement
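
For readers unfamiliar with the per-group scale and shift mentioned in the QBitsTensor issue above, here is a minimal pure-Python sketch of group-wise affine quantization. It is not quanto's implementation: the function names, the min/max scheme, and the flat-list layout are assumptions chosen for illustration only.

```python
def quantize_groupwise(values, group_size, bits=4):
    """Quantize a flat list of floats in blocks of `group_size`.

    Hypothetical helper: returns (codes, scales, shifts), with one
    scale and one shift stored per group rather than per value.
    """
    qmax = 2 ** bits - 1
    codes, scales, shifts = [], [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        lo, hi = min(group), max(group)
        # Fall back to 1.0 so a constant group does not divide by zero.
        scale = (hi - lo) / qmax or 1.0
        scales.append(scale)
        shifts.append(lo)
        codes.extend(round((v - lo) / scale) for v in group)
    return codes, scales, shifts


def dequantize_groupwise(codes, scales, shifts, group_size):
    """Rebuild approximate floats from codes and per-group scale/shift."""
    return [
        c * scales[i // group_size] + shifts[i // group_size]
        for i, c in enumerate(codes)
    ]
```

A smaller `group_size` tracks local value ranges more closely but stores more scale and shift entries, which is the storage overhead the issue is concerned with.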

I have been trying to play around with the QAT pipeline with LLM text generation. I have adopted the code from `examples/nlp/text-generation/quantize_causal_lm_model.py` and used gpt2-small for my model (but same...

Hi, I’m encountering an issue when trying to quantize the model. The quantization process completes successfully, but when I attempt to calibrate or make a prediction, I receive the following...

When running the pixart sigma example on CUDA arch >= 80 with `int4` weights, the following error happens: ```shell File "/home/ubuntu/dev/quanto/optimum/quanto/tensor/qtensor_func.py", line 152, in linear return QTensorLinear.apply(input, other, bias) File...

When building a package locally, the `.h`, `.cpp` and `.cu` files are added to `MANIFEST.in` automatically by `setuptools_scm`. However, when building the package on the CI, or when installing it...

When using MarlinInt4WeightQBitsTensor and its associated optimized gemm kernel, there are issues with the weight/scales/zero-point readback as soon as parallelization increases. The consequence is that output features higher than 128...

bug
help wanted