ENH: `quantization=None` should be more intuitive when launching models rather than `quantization='none'`
Using `quantization=None` can indeed launch the model and run successfully. However, even when the model is loaded with `quantization=None`, the value is still converted to `quantization='none'` when the model instance is created. In this context, `None` represents a failure to correctly match the model.
For more details, you can refer to the code: https://github.com/xorbitsai/inference/blob/main/xinference/model/llm/core.py#L202.
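To illustrate the distinction being discussed, here is a minimal, hypothetical sketch (not the actual xinference source; the function name and behavior are assumptions) of why treating `None` as an alias for the string `'none'` can be convenient at the API surface while the internal matching logic still operates on strings:

```python
def normalize_quantization(quantization):
    """Hypothetical normalization step, sketching the pattern under discussion.

    At the user-facing API, passing None is the intuitive way to say
    "no quantization". Internally, model matching compares string spec
    values, so None is coerced to the sentinel string 'none'.
    """
    if quantization is None:
        # None from the caller means "no quantization requested";
        # map it to the string sentinel used by the matching logic.
        return "none"
    return quantization


# The user writes the intuitive form...
user_value = normalize_quantization(None)
# ...and the internal matcher sees the string sentinel.
print(user_value)  # 'none'
print(normalize_quantization("4-bit"))  # '4-bit' passes through unchanged
```

This keeps both spellings valid for callers while the rest of the codebase only ever deals with strings, which is one way to implement the enhancement requested in this issue.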
This issue is stale because it has been open for 7 days with no activity.
This issue was closed because it has been inactive for 5 days since being marked as stale.