bigbird
Precision equals Recall in run_classifier.py script run.
I am trying to replicate the results of the paper. I ran the run_classifier.py
script for 7,000 training steps on IMDB reviews. After every 1,000 batches, precision, recall, accuracy, F1 score, and loss are printed on the terminal. For every checkpoint, precision = recall = F1 = accuracy down to the last decimal place. I wonder if there is a mistake in the calculation: for a binary dataset, precision, recall, and accuracy should not all be identical.
For example, at ckpt-1000 I got 0.9408210 as the value of precision, recall, accuracy, and F1.
Hi, I think they used prec@k instead of precision: https://github.com/google-research/bigbird/blob/db06498ec8804c6438111938d8654b66ddaccd5d/bigbird/classifier/run_classifier.py#L282-L283
Here are the official docs for the two: Precision: https://www.tensorflow.org/api_docs/python/tf/compat/v1/metrics/precision P@k: https://www.tensorflow.org/api_docs/python/tf/compat/v1/metrics/precision_at_k
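To see why precision@k collapses to accuracy here, note that with k=1 and exactly one true label per example, the set of "retrieved" items and the set of "relevant" items both have size 1 for every example, so precision@1, recall@1, and accuracy all reduce to the same ratio. A small NumPy sketch with made-up logits (not from the repo) illustrates this:

```python
import numpy as np

# Hypothetical logits and labels for a binary task (illustrative only).
logits = np.array([[0.2, 0.8], [0.9, 0.1], [0.4, 0.6], [0.7, 0.3]])
labels = np.array([1, 0, 0, 1])

top1 = logits.argmax(axis=1)   # top-1 prediction per example
hits = (top1 == labels)

accuracy = hits.mean()
# precision@1: of the k=1 items retrieved per example, the fraction that
# are relevant. Each example retrieves exactly one item, so the
# denominator is just the number of examples.
precision_at_1 = hits.sum() / (len(labels) * 1)
# recall@1: of the relevant items, the fraction retrieved. Each example
# has exactly one relevant label, so the same ratio again.
recall_at_1 = hits.sum() / len(labels)

assert accuracy == precision_at_1 == recall_at_1
```

So the identical numbers are not a numerical bug; they are a property of precision@1 on a single-label task.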
tf.compat.v1.metrics.precision is the traditional precision metric we want here.
I personally tried tf.compat.v1.metrics.precision and it worked. Here is the sample code:
precision, precision_op = tf.compat.v1.metrics.precision(
    labels=label_ids, predictions=predictions, weights=None, name="precision")
The same applies to recall with tf.compat.v1.metrics.recall. With 2,000 training steps I finally got a precision of 0.9483 and a recall of 0.9606.
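For reference, the traditional precision and recall that these metrics compute can be checked by hand. A NumPy sketch with made-up predictions and labels (not the outputs of the IMDB run) shows that, unlike precision@1, the two generally differ:

```python
import numpy as np

# Hypothetical binary predictions and labels (illustrative only).
labels      = np.array([1, 0, 1, 1, 0, 0])
predictions = np.array([1, 0, 0, 1, 1, 1])

tp = np.sum((predictions == 1) & (labels == 1))   # true positives
fp = np.sum((predictions == 1) & (labels == 0))   # false positives
fn = np.sum((predictions == 0) & (labels == 1))   # false negatives

precision = tp / (tp + fp)   # of predicted positives, how many are correct
recall    = tp / (tp + fn)   # of actual positives, how many were found

# Here precision = 0.5 while recall = 2/3: distinct values, as expected
# for a genuine binary precision/recall computation.
```

This matches the distinct 0.9483 / 0.9606 numbers above, whereas the prec@k-based code yields a single shared value.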
Hope this helps.