make auto the default for layer config
Description
This changes `config_from_keras_model` in 'name' mode so that layer precisions default to 'auto'. (It is generally recommended to pass the backend in that case.) The default precision is still provided at the model level and is generally used as the real default when a precision cannot be inferred.
Note: for PTQ, this change could cause the model widths to become huge! Care must be taken. Perhaps a flag should be provided to control whether 'auto' is used?
Note: if `config_from_keras_model` is used in 'model' or 'type' mode, there is no change.
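For reference, a rough sketch of what the resulting configuration looks like in 'name' mode; the layer name and the exact per-layer keys are illustrative, not verbatim hls4ml output:

```python
config = {
    'Model': {
        'Precision': 'fixed<16,6>',  # model-level default: the real fallback
        'ReuseFactor': 1,
    },
    'LayerName': {
        'dense1': {
            'Precision': 'auto',  # previously a concrete default; now 'auto',
                                  # to be resolved by precision inference
        },
    },
}
```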
Type of change
- [x] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Tests
The standard tests should verify this change.
Checklist
- [x] I have read the guidelines for contributing.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [ ] I have made corresponding changes to the documentation.
- [x] My changes generate no new warnings.
- [x] I have installed and run `pre-commit` on the files I edited or added.
- [x] I have added tests that prove my fix is effective or that my feature works.
Added maximum precision to the Keras options. In `InferPrecisionTypes`, `_infer_common_precision` now uses the maximum precision so that the inferred sizes do not become too big. Note: for the maximum precision, the total width and the integer width are considered separately. The total width that is set is never bigger than the maximum total width, and the integer width that is set is never bigger than the maximum integer width; no relation between the two is enforced. Also, because of #1022, there is no attempt to enforce any maximum width in `_infer_sepconv_precision`. After #1022, both the depthwise and pointwise convolutions use `_infer_common_precision`, and `_infer_sepconv_precision` is no longer used.
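A minimal sketch of the capping behavior described above; the function and argument names here are hypothetical, not the actual hls4ml internals:

```python
def cap_precision(width, integer, max_width, max_integer):
    # Total width and integer width are capped independently;
    # no relation between the two is enforced.
    return min(width, max_width), min(integer, max_integer)
```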
The `test_keras_api.py::test_depthwise*` and `test_qkeras.py::test_qdepthwiseconv2d` tests pass after #1022 is merged.
I added the ability to pass the backend to the `config_from_*` functions. This works much better in exposing all the configuration parameters one can set.
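A usage sketch, assuming the hls4ml API after this change; `keras_model` is a placeholder, and passing `backend` is the addition described above:

```python
import hls4ml

config = hls4ml.utils.config_from_keras_model(
    keras_model,                # a Keras model to convert
    granularity='name',         # per-layer config; precisions default to 'auto'
    backend='Vitis',            # recommended, so backend-specific options appear
    default_precision='fixed<16,6>',  # model-level fallback when nothing can be inferred
)
```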
Based on discussions with other people, we decided not to add a special type-propagation fallback value; the model-level value is the default. We do support a maximum precision, but there is no flag to avoid placing 'auto' in the configuration.
This is for after #1022 is merged, which I believe fixes the failing tests.
Can this be extended so that it applies to QONNX ingestion as well?
I think the code in https://github.com/fastmachinelearning/hls4ml/pull/979 infers the precision from the QONNX model, so that should be covered there. The PyTorch parser, however, will need to be updated to use 'auto' by default, but I would prefer not to burden this PR with that and do it as a follow-up.
By the way, one thing I am not too happy about is that the purpose of the optimize flow seems to have largely disappeared, because many of the optimizers benefit from running before precisions become fixed. Generally, if a precision is set, I have been hesitant to ignore it in further optimizations or to do anything that would change the numerical result. This is why I still think it may be a good idea to take `infer_precision_types` out of the convert flow and move it to a later stage.
Can we even do this? I thought we agreed on the approach of inferring early, since the lack of precision will trip up downstream optimizers even more.
I am not sure about the backend-specific optimizers, like the type conversions, but the type-agnostic ones don't need types, or they benefit from knowing that they don't have to enforce a certain type somewhere.
> I think the code in #979 infers the precision from the QONNX model, so that should be covered there.
It does for weights/biases and in general for Quant nodes in the network, but not for accumulators, I think.