[BUG] Loading metrics is extremely slow
In [7]: %time evaluate.load('f1')
Using the latest cached version of the module from /opt/huggingface/modules/evaluate_modules/metrics/evaluate-metric--f1/0ca73f6cf92ef5a268320c697f7b940d1030f8471714bffdb6856c641b818974 (last modified on Mon Mar 27 21:11:41 2023) since it couldn't be found locally at evaluate-metric--f1, or remotely on the Hugging Face Hub.
CPU times: user 44.1 ms, sys: 0 ns, total: 44.1 ms
Wall time: 1.8 s
This happens every time I load any module, and not just the first time.
In [11]: evaluate.__version__
Out[11]: '0.4.0'
Related issue:
- https://github.com/huggingface/evaluate/issues/315#issuecomment-1431204163
This workaround cuts the load time to around 10 ms. The variable has to be set before importing evaluate:
import os
os.environ['HF_EVALUATE_OFFLINE'] = '1'
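Because the workaround only helps when the variable is set before the import, a small guard can make an ordering mistake fail loudly. This is a minimal sketch, not part of the evaluate library: `enable_offline_mode` is a hypothetical helper name, and the `sys.modules` check is an assumption about how to detect an earlier import.

```python
import os
import sys


def enable_offline_mode():
    """Set HF_EVALUATE_OFFLINE, refusing to do so if evaluate is already imported.

    Hypothetical helper: the name and the guard are illustrative only.
    """
    if "evaluate" in sys.modules:
        # Setting the variable now would come too late for this workaround.
        raise RuntimeError(
            "evaluate is already imported; set HF_EVALUATE_OFFLINE before the import"
        )
    os.environ["HF_EVALUATE_OFFLINE"] = "1"


enable_offline_mode()
# Import evaluate only after this point, for example:
# import evaluate
```

Calling the helper at the very top of the entry-point script keeps the ordering requirement in one place.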
Yes, I set os.environ['HF_EVALUATE_OFFLINE'] = '1', but loading meteor is still quite slow. It seems to take a long time to verify that the nltk-data is up to date.
Note that my workaround needs the metrics to be cached locally, so the first time one runs the script, HF_EVALUATE_OFFLINE should be 0. This sucks.
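One way to avoid flipping the variable by hand is to check for the cached module directory before importing evaluate. The sketch below assumes the cache layout visible in the log message above (`<modules cache>/evaluate_modules/metrics/evaluate-metric--<name>`); the helper names and the `modules_cache` parameter are hypothetical, and the layout may differ between versions.

```python
import os
from pathlib import Path


def metric_is_cached(modules_cache, metric_name):
    """Return True if a non-empty cache directory exists for the metric.

    Assumes the directory layout shown in the log message; not an official API.
    """
    metric_dir = (
        Path(modules_cache)
        / "evaluate_modules"
        / "metrics"
        / f"evaluate-metric--{metric_name}"
    )
    return metric_dir.is_dir() and any(metric_dir.iterdir())


def configure_offline(modules_cache, metric_names):
    """Go offline only when every requested metric is already cached."""
    all_cached = all(metric_is_cached(modules_cache, name) for name in metric_names)
    os.environ["HF_EVALUATE_OFFLINE"] = "1" if all_cached else "0"
    return all_cached


# Example call, to be made before importing evaluate
# (the path is the one from the log above):
# configure_offline("/opt/huggingface/modules", ["f1"])
```

On the first run the directory is missing, so the variable stays at 0 and the metric can be downloaded; later runs find the cache and switch to offline mode.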
I have the same issue with evaluate.load: it takes a long time even after setting HF_EVALUATE_OFFLINE to 1! Any suggestions?
I refactored my code to no longer use evaluate. It was easy enough to compute the metrics myself. The library is so badly designed that it's not worth using.
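For readers considering the same route, here is a minimal sketch of computing binary F1 and accuracy without evaluate. The comment above does not say which metrics were reimplemented or how, so the function names, the `positive_label` parameter, and the choice of binary F1 are all assumptions for illustration.

```python
def f1_score(references, predictions, positive_label=1):
    """Binary F1 for the given positive label: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(references, predictions))
    tp = sum(1 for r, p in pairs if r == positive_label and p == positive_label)
    fp = sum(1 for r, p in pairs if r != positive_label and p == positive_label)
    fn = sum(1 for r, p in pairs if r == positive_label and p != positive_label)
    denominator = 2 * tp + fp + fn
    # Return 0.0 when there are no positive references or predictions at all.
    return 2 * tp / denominator if denominator else 0.0


def accuracy(references, predictions):
    """Fraction of predictions that match the references."""
    if not references:
        return 0.0
    correct = sum(1 for r, p in zip(references, predictions) if r == p)
    return correct / len(references)


refs = [0, 1, 0, 1, 0]
preds = [0, 0, 1, 1, 0]
print(f1_score(refs, preds))   # 0.5
print(accuracy(refs, preds))   # 0.6
```

This covers only the simplest case; averaging over multiple classes or metrics that need external resources (such as meteor) would take more work.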
Thank you for the quick reply, @NightMachinery!
I ended up using this workaround for now (clone the repository, then load the metric script from the local checkout): https://github.com/huggingface/evaluate/issues/315#issuecomment-1925027335
git clone https://github.com/huggingface/evaluate.git
from evaluate import load
metric = load('/local/path/to/evaluate/metrics/accuracy/accuracy.py')