Don't request model from HuggingFace when running prediction
When running `genienlp predict` on a local model, we should not send any requests to HuggingFace servers at all. I think the transformers library sends a request by default to resolve model names against the most up-to-date model list. Yesterday, HF servers were down, and my runs on our local server would crash.

Setting `TRANSFORMERS_OFFLINE=1` as an environment variable when we are loading the model from disk should work. There might be other solutions as well.
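For example, something like this should keep prediction fully offline (a minimal sketch; the model name and cache directory are placeholders):

```python
import os

# Set the variable before transformers is imported, since the library
# reads it when the module loads.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoConfig

# With offline mode on, this resolves config.json from the local cache only,
# and raises an error instead of contacting HF servers if the file is missing.
config = AutoConfig.from_pretrained("facebook/bart-base", cache_dir=".embeddings")
```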
From this post, it seems passing `local_files_only=True` when loading the model works too.
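A per-call version of the same idea could look like this (again a sketch; the model name and cache directory stand in for `args.pretrained_model` and `args.embeddings`):

```python
from transformers import AutoConfig, AutoModelForSeq2SeqLM

config = AutoConfig.from_pretrained(
    "facebook/bart-base",
    cache_dir=".embeddings",
    local_files_only=True,  # never contact HF servers for this call
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/bart-base",
    cache_dir=".embeddings",
    config=config,
    local_files_only=True,
)
```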
The download happens only when the `.embeddings` directory does not exist or does not contain the model being used. In that case, whenever we reach a line like `config = AutoConfig.from_pretrained(args.pretrained_model, cache_dir=args.embeddings)` in `TransformerSeq2Seq`, HF automatically downloads the `config.json` file from HF servers, and we then pass it to the HF model's constructor by calling `super().__init__(config)`. This config object is stored on the model and determines the correct behavior of various internal methods. We never save this config file when we save a model, so there is no local copy of it.
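Roughly, the pattern is something like this (a hedged sketch, not genienlp's actual class; the base class is chosen just for illustration):

```python
from transformers import AutoConfig, BartForConditionalGeneration

class TransformerSeq2Seq(BartForConditionalGeneration):
    def __init__(self, pretrained_model, embeddings_dir):
        # Resolving the config is what triggers the request to HF servers
        # when config.json is not already cached under embeddings_dir.
        config = AutoConfig.from_pretrained(pretrained_model, cache_dir=embeddings_dir)
        # The config object is stored on the model and drives its internal methods.
        super().__init__(config)
```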
All this to say that resolving this issue is not easy.

Should we start saving the HF config files then? Then we would set `local_files_only` to `True` only if the config file is detected in `--path`, and to `False` otherwise (for backward compatibility).
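The predict-time logic could then look roughly like this (the helper name and arguments are hypothetical); saving the file itself would be a one-liner at checkpoint time, e.g. `config.save_pretrained(path)`:

```python
import os
from transformers import AutoConfig

def load_config(path, pretrained_model, embeddings_dir):
    """Hypothetical helper: prefer the config saved in --path, and fall back
    to the old (network-dependent) behavior for older checkpoints."""
    if os.path.exists(os.path.join(path, "config.json")):
        # A config was saved alongside the model: stay fully offline.
        return AutoConfig.from_pretrained(path, local_files_only=True)
    # Backward compatibility: no saved config, resolve as before.
    return AutoConfig.from_pretrained(pretrained_model, cache_dir=embeddings_dir)
```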