optimum-intel Add JPQD evaluation notebook

Add JPQD evaluation notebook. Since JPQD QA takes about 12 hours to train, it doesn't make sense to train in a notebook: if the browser crashes or the computer goes to sleep, training stops. So the notebook just refers to the training example and is used only to evaluate the model.

This makes the notebook similar to the PTQ QA notebook. I thought about removing duplication but I think duplication in examples is not so bad, at least for now. It's nice that examples are standalone.

Since JPQD starts from a plain bert-base-uncased model, I fine-tuned a bert-base-uncased model following the transformers run_qa.py example to compare performance.

Instead of making this a JPQD-specific notebook, it could make more sense to make it a generic QA INT8 evaluation notebook. On the other hand, it's an example that people can easily adapt for similar purposes, and it's nice to promote JPQD.

TODO: the intro text at the top needs to explain a bit more about JPQD.

Colab link: https://colab.research.google.com/github/helena-intel/optimum-intel/blob/jpqd-notebook/notebooks/openvino/question_answering_quantization_jpqd.ipynb (performance is probably bad on Colab because there is no AVX512/VNNI).
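To check whether a machine (Colab or local) actually exposes AVX512/VNNI before reading too much into the timings, something like the following could be used. This is a minimal sketch that assumes a Linux host with `/proc/cpuinfo`; the helper names `parse_cpu_flags` and `int8_support` are made up for illustration and are not part of optimum-intel or OpenVINO.

```python
from __future__ import annotations

from pathlib import Path

# Flag names as they appear in Linux /proc/cpuinfo for the relevant
# instruction sets (AVX-512 foundation, AVX-512 VNNI, AVX VNNI).
INT8_FLAGS = ("avx512f", "avx512_vnni", "avx_vnni")


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def int8_support(cpuinfo_text: str) -> dict[str, bool]:
    """Report which of the INT8-relevant flags are present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in INT8_FLAGS}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(int8_support(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo found; this check only works on Linux.")
```

If all three values are `False`, slow INT8 inference on that machine is expected and says little about the model itself.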

@vuiseng9

Mar 12 '23 21:03 helena-intel

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Mar 12 '23 21:03 HuggingFaceDocBuilderDev

I think it would be more useful to show the performance and accuracy trade-offs for three models:

Original Transformer model (fp32)
Quantized model (PTQ/QAT)
Pruned and quantized model (JPQD; distillation is an auxiliary method here)
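
A rough sketch of how such a three-way comparison could be tabulated is below. It assumes each model is wrapped as a plain Python callable that maps an input to a predicted answer, and it uses exact match as a stand-in for the real QA metrics. The names `evaluate` and `compare_models` are hypothetical, and the toy callables at the bottom only stand in for real fp32, quantized, and JPQD models.

```python
import time
from typing import Callable, Dict, Sequence, Tuple

Example = Tuple[str, str]  # (input, expected answer)


def evaluate(predict: Callable[[str], str], examples: Sequence[Example]) -> Dict[str, float]:
    """Run one model over the examples and record exact match and mean latency."""
    correct = 0
    start = time.perf_counter()
    for text, expected in examples:
        if predict(text) == expected:
            correct += 1
    elapsed = time.perf_counter() - start
    return {
        "exact_match": correct / len(examples),
        "latency_ms": 1000.0 * elapsed / len(examples),
    }


def compare_models(
    models: Dict[str, Callable[[str], str]], examples: Sequence[Example]
) -> Dict[str, Dict[str, float]]:
    """Evaluate every model on the same examples so the rows are comparable."""
    return {name: evaluate(predict, examples) for name, predict in models.items()}


if __name__ == "__main__":
    # Toy stand-ins: real callables would wrap the three models' QA pipelines.
    toy_examples = [("a", "a"), ("b", "b"), ("c", "c")]
    toy_models = {
        "fp32": lambda text: text,
        "quantized": lambda text: text,
        "jpqd": lambda text: text if text != "c" else "?",
    }
    for name, row in compare_models(toy_models, toy_examples).items():
        print(f"{name:10s} exact_match={row['exact_match']:.2f} latency_ms={row['latency_ms']:.4f}")
```

Running all three models over the same examples on the same machine keeps the accuracy and latency columns directly comparable.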

Mar 21 '23 07:03 AlexKoff88

@yujiepan-work and @vuiseng9 implemented very nice lightweight tests for JPQD training. 9 epochs take just a few seconds on a single card. I'd reuse them for this notebook. https://github.com/openvinotoolkit/nncf/blob/develop/tests/torch/sparsity/movement/test_training.py#L237

If we need very good accuracy/performance results, there are longer tests to consider: https://github.com/openvinotoolkit/nncf/blob/develop/tests/torch/sparsity/movement/test_training.py#L318 If I am not mistaken, they take minutes. @yujiepan-work could probably say the exact time.

Apr 14 '23 09:04 ljaljushkin
