GPTQ-triton
Cache auto-tuning?
When running the model, especially in a serverless environment where there may be many cold starts, it would be desirable to cache the auto-tuning results so that they are not recomputed on every start. Is this possible?
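To illustrate the kind of caching I have in mind, here is a minimal sketch in plain Python. It is not how GPTQ-triton or Triton's autotuner works today; `best_config`, `benchmark`, the key format, and the cache file path are all hypothetical names chosen for the example. The idea is only that the winning configuration for each kernel and shape is written to disk once and read back on later cold starts.

```python
import json
import os
import tempfile


def load_cache(path):
    """Return previously saved tuning results, or an empty dict on a cold cache."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def save_cache(path, cache):
    """Write to a temp file, then rename, so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(cache, f)
    os.replace(tmp, path)


def best_config(path, key, candidates, benchmark):
    """Return the fastest config for `key`, benchmarking only on a cache miss.

    `benchmark` is a hypothetical callable that takes one config (a
    JSON-serializable dict) and returns its measured runtime.
    """
    cache = load_cache(path)
    if key not in cache:
        cache[key] = min(candidates, key=benchmark)
        save_cache(path, cache)
    return cache[key]
```

In a serverless setup the cache file would presumably need to live on persistent storage or be baked into the image, and the key would need to include anything that changes the best configuration (for example the GPU model), but those details are assumptions on my part.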