
Only supports Hopper GPUs?

Open QingshuiL opened this issue 10 months ago • 3 comments

Very good work, but I have a few questions.

When I tried to run the code, I encountered the following error.

[rank0]: Traceback (most recent call last):
[rank0]:   File "/.conda/envs/torch2.4/lib/python3.10/site-packages/triton/language/core.py", line 35, in wrapper
[rank0]:     return fn(*args, **kwargs)
[rank0]:   File "/.conda/envs/torch2.4/lib/python3.10/site-packages/triton/language/core.py", line 993, in to
[rank0]:     return semantic.cast(self, dtype, _builder, fp_downcast_rounding)
[rank0]:   File "/.conda/envs/torch2.4/lib/python3.10/site-packages/triton/language/semantic.py", line 759, in cast
[rank0]:     assert builder.options.allow_fp8e4nv, "fp8e4nv data type is not supported on CUDA arch < 89"
[rank0]: AssertionError: fp8e4nv data type is not supported on CUDA arch < 89

So the project is still subject to the FP8 hardware restriction (CUDA arch >= 89)?
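
For reference, the assertion comes from Triton checking the GPU's compute capability before emitting fp8e4nv code; native FP8 kernels need compute capability 8.9 or higher (Ada or Hopper), while the A100 reports 8.0. A quick way to confirm what your device reports (plain PyTorch, nothing project-specific):

import torch

# fp8e4nv in Triton requires compute capability >= (8, 9);
# A100 is (8, 0), so native FP8 kernels hit the assertion above.
major, minor = torch.cuda.get_device_capability()
if (major, minor) < (8, 9):
    print(f"sm_{major}{minor}: native fp8e4nv kernels unavailable")
else:
    print(f"sm_{major}{minor}: native FP8 supported")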

QingshuiL avatar Feb 20 '25 13:02 QingshuiL

I can only use A100 GPUs at the moment. Will quantization and inference on the A100 be supported in the future? If so, how many A100-40G GPUs would be required? My rough estimate is sketched below.

QingshuiL avatar Feb 21 '25 03:02 QingshuiL

Which model do you need to quantize? Please provide your configuration file.

gushiqiao avatar Feb 24 '25 07:02 gushiqiao

The latest code should now be able to run FP8 models on A100 GPUs.
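
For anyone curious how FP8 can work on pre-Hopper cards at all: a common approach is weight-only "fake" quantization, where weights round-trip through float8_e4m3fn to capture the quantization error while the matmul itself still runs in fp16/bf16. A minimal sketch of that general idea (illustrative only, not necessarily LightCompress's exact kernel path):

import torch

def fake_fp8_quant(w: torch.Tensor) -> torch.Tensor:
    # Scale into the e4m3 range (max representable value is 448),
    # cast to float8_e4m3fn, then cast back to the original dtype.
    # No native FP8 matmul hardware is needed for this round trip.
    scale = (w.abs().amax() / 448.0).clamp(min=1e-12)
    w_fp8 = (w / scale).to(torch.float8_e4m3fn)
    return w_fp8.to(w.dtype) * scale

w = torch.randn(4096, 4096, dtype=torch.float16)
w_q = fake_fp8_quant(w)
print((w - w_q).abs().max())  # quantization error introduced by FP8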

gushiqiao avatar May 07 '25 08:05 gushiqiao