RFC: Mechanism to indicate that a TFLite model is structurally pruned
| Status | Draft |
|---|---|
| RFC # | 398 |
| Author(s) | Elena Zhelezina ([email protected]) |
| Sponsor | David Rim ([email protected]) |
Objective
For a structurally pruned TFLite model, the only way hardware can currently detect the pruning, and so benefit from it, is to scan all weights and check them. This scan adds to inference time and must be repeated every time the model is loaded. The goal of this RFC is to add a special flag to the TFLite file to mark such models.
It saves model loading time but increases model conversion time. I have two questions about this proposal:
- How does the flag map to the tensors (weights)?
- What additional check will be done to the indicated tensors?
Yes, when this flag is set, we need to check every weight (tensor) of the model. As a result, conversion time will be longer, but inference time will be shorter. Conversion is usually done on more powerful machines, so it is better to do this check during conversion. The only check is to identify weights pruned with m/n sparsity for Conv2D and Dense layers.
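To make the m/n check concrete, here is a minimal sketch in plain Python. It is not the converter's implementation. It assumes "m/n sparsity" means that every block of n consecutive weights contains at least m zeros, and that the weights have already been flattened along the pruned axis. The function name `is_m_by_n_sparse` is hypothetical.

```python
from typing import Sequence


def is_m_by_n_sparse(weights: Sequence[float], m: int, n: int) -> bool:
    """Return True if every block of n consecutive weights has at least m zeros.

    Hypothetical helper: the block convention and the flattening order are
    assumptions, not something specified by the RFC.
    """
    if not 0 < m <= n:
        raise ValueError("expected 0 < m <= n")
    if len(weights) % n != 0:
        # Assumption: a tensor whose size is not a multiple of n
        # is not treated as structurally pruned.
        return False
    for start in range(0, len(weights), n):
        block = weights[start:start + n]
        zeros = sum(1 for w in block if w == 0)
        if zeros < m:
            return False
    return True


# Two blocks of four weights each, checked for 2/4 sparsity.
pruned = [0.0, 0.5, 0.0, -1.2, 0.3, 0.0, 0.0, 0.7]
dense = [0.1, 0.5, 0.0, -1.2, 0.3, 0.2, 0.4, 0.7]
print(is_m_by_n_sparse(pruned, 2, 4))  # True
print(is_m_by_n_sparse(dense, 2, 4))   # False
```

A check like this touches every weight once, which is the cost that would move from model load time to conversion time.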
Has this been or is this ready for community review?
@rino20, can you take over as sponsor?
Sorry for all the comments. No, it's not ready for community review yet.
Closing, as it is no longer relevant.