Jaehong Kim

22 comments by Jaehong Kim

The QAT model is only for training, so the weights of a QAT model are in float form. For 8-bit quantization, we do the actual quantization during TFLite conversion. (Inference also just...
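For context, a minimal sketch of that flow, assuming a toy Keras model (the model and training setup here are hypothetical; the point is that quantize_model only injects fake-quant for training, and the int8 conversion happens in the TFLite converter):

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Hypothetical float model; any supported Keras model works here.
base_model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

# QAT wraps layers with fake-quant nodes; weights stay float32 during training.
qat_model = tfmot.quantization.keras.quantize_model(base_model)
qat_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# ... train qat_model as usual ...

# The actual int8 quantization happens here, at TFLite conversion time.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
```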

We still don't fully support subclass models yet. The model we mentioned in that blog post means we wrote some code to support a kind of subclass model, which implemented...

We have an experimental n-bit quantization scheme here: https://github.com/tensorflow/model-optimization/tree/master/tensorflow_model_optimization/python/core/quantization/keras/experimental/default_n_bit We plan to release an example that shows how to use this default_n_bit scheme on the TensorFlow official vision model....
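A minimal sketch of how that scheme can be plugged in, assuming the annotate-then-apply flow; since this lives under experimental/, the constructor arguments shown (num_bits_weight, num_bits_activation) may change between releases:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow_model_optimization.python.core.quantization.keras.experimental.default_n_bit import (
    default_n_bit_quantize_scheme,
)

# Hypothetical model just for illustration.
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
])

# Annotate the whole model, then apply the experimental n-bit scheme
# (here: 4-bit weights, 8-bit activations).
annotated = tfmot.quantization.keras.quantize_annotate_model(model)
nbit_model = tfmot.quantization.keras.quantize_apply(
    annotated,
    scheme=default_n_bit_quantize_scheme.DefaultNBitQuantizeScheme(
        num_bits_weight=4, num_bits_activation=8))
```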

Hi, the model input shape is (224, 224, 3), so you can try size=(1, 224, 224, 3) in the representative_dataset function. Thanks.
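A minimal sketch of such a representative_dataset, assuming post-training quantization from a saved model (the path is a placeholder, and real calibration should yield actual preprocessed images rather than random data):

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # One calibration sample per yield, batched to (1, 224, 224, 3).
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model('saved_model_dir')  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_model = converter.convert()
```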

Would you please provide more details? Did you try to use TF1 QAT on a TF1 model and it raised that error?

If you run inference on a QAT model, it already simulates 4 bits with fake-quant. The only difference is that we use float32 ops. Basically, the input and weights are quant-dequanted by the injected fake-quant....
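A small sketch of that quant-dequant behavior with TensorFlow's fake-quant op (the min/max range here is illustrative): the tensor is snapped to a 4-bit grid but remains float32, which is exactly what QAT inference computes with.

```python
import tensorflow as tf

x = tf.constant([[-1.2, 0.3, 0.75, 2.0]], dtype=tf.float32)

# Quant-dequant ("fake quant"): values are rounded onto a 4-bit grid,
# but the output tensor is still float32.
x_fq = tf.quantization.fake_quant_with_min_max_args(
    x, min=-2.0, max=2.0, num_bits=4)
print(x_fq)  # float32 values restricted to 2**4 = 16 levels
```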

Hi, would you please give us some examples? We usually assume BNs will be folded (fused) into a nearby layer for optimization. I'd like to know some use cases where it...
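For reference, this is the standard Conv+BN folding identity that "folded into a nearby layer" refers to, as a small numpy sketch (the function name and shapes are mine, not from TFMOT):

```python
import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-3):
    """Fold BatchNorm parameters into the preceding conv's weights/bias.

    w: conv kernel of shape (kh, kw, in_ch, out_ch); BN stats are per out_ch.
    """
    scale = gamma / np.sqrt(var + eps)   # per-output-channel scale
    w_folded = w * scale                 # broadcasts over the out_ch axis
    b_folded = beta + (b - mean) * scale
    return w_folded, b_folded
```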

AFAIK, even if you apply prune_low_magnitude to sublayers of your subclass model, the pruning scheduling logic (callback-based) doesn't work during training because it can't find your pruned...
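By contrast, the usual working setup puts the pruned layers in a functional (or sequential) model, where the callback can discover the pruning wrappers. A minimal sketch with synthetic data and arbitrary layer sizes:

```python
import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Pruned layers live in a functional model, so UpdatePruningStep can find them.
inputs = tf.keras.Input(shape=(784,))
x = tfmot.sparsity.keras.prune_low_magnitude(
    tf.keras.layers.Dense(128, activation='relu'))(inputs)
outputs = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Synthetic data just to make the sketch runnable.
x_train = np.random.rand(256, 784).astype('float32')
y_train = np.random.randint(0, 10, size=(256,))

# The scheduling logic is callback-based: without UpdatePruningStep the
# pruning step never advances and training errors out.
model.fit(x_train, y_train, epochs=2,
          callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
```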

It looks like a rounding/numerical-precision issue (e.g. zero_point_from_min=127.5, but sometimes zero_point_from_min=127.49999). AFAIK, the fake_quant op implementation has some numerical error: [as-is] (clamped_shifted / nudged_scale_repl + 0.5f).floor() * nudged_scale_repl +...
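A tiny sketch of that failure mode with illustrative values (not the kernel's actual inputs): nominally 38.25 / 0.3 = 127.5, so add-0.5-then-floor should give 128, but float32 division yields 127.49999..., and the rounding lands on 127 instead.

```python
import numpy as np

# Nominally 38.25 / 0.3 == 127.5, so round-half-up should give 128.
q = np.float32(38.25) / np.float32(0.3)
print(repr(q))                         # ~127.49999 due to float32 error
print(np.floor(q + np.float32(0.5)))   # 127.0, not the expected 128.0
```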

Hi, would you please explain in more detail how you made Q_quant? Is it quantized if you only use the B_quant class?