Quantization function tests fail on Pascal
The following tests fail on Pascal:
```
tests/test_functional.py::test_estimate_quantiles[float] FAILED
tests/test_functional.py::test_estimate_quantiles[half] FAILED
tests/test_functional.py::test_quantile_quantization FAILED
```
My guess is that this is due to atomicAdd on floats behaving differently on this architecture.
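To illustrate why an atomicAdd-based reduction could plausibly produce architecture-dependent results (this is a generic sketch, not bitsandbytes code): floating-point addition is not associative, so if the hardware scheduler commits atomic additions in a different order on Pascal than on newer GPUs, the accumulated value, and therefore the estimated quantiles, can differ by more than a tight test tolerance.

```python
# Minimal demonstration that summation order changes the result.
# atomicAdd on a GPU gives no ordering guarantee, so a reduction is
# effectively free to pick either grouping below on different runs
# or architectures.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # cancellation happens first, the 1.0 survives
right = a + (b + c)  # the 1.0 is absorbed into -1e16 and lost

print(left, right)   # 1.0 0.0
```

This doesn't prove atomicAdd is the culprit here, but it shows how a reordered accumulation alone is enough to flip an exact or near-exact comparison in a test.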
Tesla P40 and P100 cards have become popular options for homelab AI builds. While I'd completely understand if these Pascal architectures are not a priority, I wanted to share that this spike in popularity is being driven by the cards reaching the $200 mark (for 24 GB of GDDR5 on the P40, or 16 GB of HBM2 on the P100).
Given that price milestone, and the fact that the P40 in particular has strong native INT8 performance but abysmal FP16 performance, supporting these cards in everyone's favorite quantization framework could be a real boost for the broader open-source community :).
Related: https://www.reddit.com/r/Oobabooga/comments/11z7wrt/got_problems_with_bitsandbytes_this_may_be_a_fix/