CalibTIP
Sequential AdaQuant updates only one batch with quantized input?
https://github.com/itayhubara/CalibTIP/blob/69077c92611b079234706784c344e8c9156f3283/main.py#L481
The [0] indexes into the first batch only.
Isn't sequential AdaQuant supposed to update the input cache of all batches to the quantized values?
In line 465 I set quantize to True, so all values are quantized. In line 481 I just replace the FP32 record I have with the results from the quantized model.
Hi itayhubara,
Thanks for replying. Yes, quantize is set to True. But in the record that is used for optimization, only the first batch is updated (by the replacement you mentioned).
This is because cached_input_output is organized as [layer1, layer2, ...] and each layer is [batch1, batch2, ...]. So in line 481, only the first batch's FP32 record is replaced with the first batch of quantized values; the records for the remaining batches stay FP32.
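To illustrate what I mean, here is a toy sketch in plain Python. It is not the code from main.py: the names fp32_cache, quant_cache and make_cache are hypothetical, the records are just strings, and the [layer][batch] layout is my assumption about how the cache is organized. It only shows the difference between replacing index [0] and replacing every batch.

```python
# Toy stand-in for the cache layout described above (assumed, not taken from main.py).
num_layers, num_batches = 2, 3


def make_cache(tag):
    # cache[layer][batch] -> record for that layer and batch
    return [
        [f"{tag}_L{layer}_B{batch}" for batch in range(num_batches)]
        for layer in range(num_layers)
    ]


fp32_cache = make_cache("fp32")
quant_cache = make_cache("quant")

layer = 1  # the layer currently being optimized

# Replacing only index [0] touches a single batch of this layer.
first_only = [list(batches) for batches in fp32_cache]
first_only[layer][0] = quant_cache[layer][0]

# Replacing every batch needs a loop over the batch dimension.
all_batches = [list(batches) for batches in fp32_cache]
for batch in range(num_batches):
    all_batches[layer][batch] = quant_cache[layer][batch]

print(first_only[layer])   # ['quant_L1_B0', 'fp32_L1_B1', 'fp32_L1_B2']
print(all_batches[layer])  # ['quant_L1_B0', 'quant_L1_B1', 'quant_L1_B2']
```

If the cache really has this layout, the first variant is what I think line 481 does, and the second is what I expected sequential AdaQuant to do.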
Could you let me know if this is true? Thanks!