Don't request model from HuggingFace when running prediction
When running `genienlp predict` on a local model, we should not send any requests to HuggingFace servers at all. I think the transformers library sends a request by default to resolve model names against the most up-to-date model list. Yesterday, HF servers were down, and my runs on our local server would crash.

Setting `TRANSFORMERS_OFFLINE=1` as an environment variable when we are loading the model from disk should work. There might be other solutions as well.
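For example, something like this should keep prediction fully offline (a minimal sketch; the model name and cache directory are placeholders):

```python
import os

# Set the variable before transformers is imported, since the library
# reads it when the module loads.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoConfig

# With offline mode on, this resolves config.json from the local cache only,
# and raises an error instead of contacting HF servers if the file is missing.
config = AutoConfig.from_pretrained("facebook/bart-base", cache_dir=".embeddings")
```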
From this post, it seems passing `local_files_only=True` when loading the model works too.
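A per-call version of the same idea could look like this (again a sketch; the model name and cache directory stand in for `args.pretrained_model` and `args.embeddings`):

```python
from transformers import AutoConfig, AutoModelForSeq2SeqLM

config = AutoConfig.from_pretrained(
    "facebook/bart-base",
    cache_dir=".embeddings",
    local_files_only=True,  # never contact HF servers for this call
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/bart-base",
    cache_dir=".embeddings",
    config=config,
    local_files_only=True,
)
```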
The download happens only when the `.embeddings` directory does not exist or does not contain the model being used. In that case, whenever we reach a line like `config = AutoConfig.from_pretrained(args.pretrained_model, cache_dir=args.embeddings)` in `TransformerSeq2Seq`, HF automatically downloads the `config.json` file from HF servers, and we then pass it to the HF model's constructor by calling `super().__init__(config)`. This config object is stored on the model and determines the correct behavior of various internal methods. We never save this config file when we save a model, so there is no local copy of it.
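Roughly, the pattern is something like this (a hedged sketch, not genienlp's actual class; the base class is chosen just for illustration):

```python
from transformers import AutoConfig, BartForConditionalGeneration

class TransformerSeq2Seq(BartForConditionalGeneration):
    def __init__(self, pretrained_model, embeddings_dir):
        # Resolving the config is what triggers the request to HF servers
        # when config.json is not already cached under embeddings_dir.
        config = AutoConfig.from_pretrained(pretrained_model, cache_dir=embeddings_dir)
        # The config object is stored on the model and drives its internal methods.
        super().__init__(config)
```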
All this to say that resolving this issue is not easy.

Should we start saving the HF config files then? Then we would set `local_files_only` to `True` only if the config file is detected in `--path`, and to `False` otherwise (for backward compatibility).
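The predict-time logic could then look roughly like this (the helper name and arguments are hypothetical); saving the file itself would be a one-liner at checkpoint time, e.g. `config.save_pretrained(path)`:

```python
import os
from transformers import AutoConfig

def load_config(path, pretrained_model, embeddings_dir):
    """Hypothetical helper: prefer the config saved in --path, and fall back
    to the old (network-dependent) behavior for older checkpoints."""
    if os.path.exists(os.path.join(path, "config.json")):
        # A config was saved alongside the model: stay fully offline.
        return AutoConfig.from_pretrained(path, local_files_only=True)
    # Backward compatibility: no saved config, resolve as before.
    return AutoConfig.from_pretrained(pretrained_model, cache_dir=embeddings_dir)
```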