GLiNER
Quantization Support for the GLiNER Model during Fine-Tuning on Custom Data
I am fine-tuning GLiNER-medium-v2.1 on custom data and need to apply quantization for performance optimization. I am looking for guidance on the following:
- Supported quantization methods: which approaches (e.g., post-training quantization, quantization-aware training) are compatible with fine-tuning?
- Workflow: how can I integrate quantization into the fine-tuning pipeline? Are there recommended tools or best practices?
- Limitations: are there any known issues or limitations specific to quantizing the GLiNER model?
- Example: can you provide a sample implementation or reference for applying quantization in this context?
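For context, the kind of example I have in mind is something like the sketch below: post-training dynamic quantization applied to the fine-tuned model's linear layers with standard PyTorch. The commented-out loading lines assume the `gliner` package's `GLiNER.from_pretrained` API; the small `nn.Sequential` stand-in is purely illustrative, used only so the snippet is self-contained.

```python
import torch

# Assumed GLiNER usage (not run here):
# from gliner import GLiNER
# model = GLiNER.from_pretrained("urchade/gliner_medium-v2.1")

# Illustrative stand-in module so the example runs end to end:
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 32),
)

# Post-training dynamic quantization: weights of nn.Linear layers are
# converted to int8; activations are quantized dynamically at inference.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Inference with the quantized model works like the original:
out = quantized(torch.randn(1, 128))
```

Would this style of post-training approach be the recommended path for GLiNER, or is quantization-aware training during fine-tuning preferable?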