Use commit hash to look in cache instead of calling head
What does this PR do?
This PR tries to limit the calls to requests.head made for cached models every time we try to load them. Currently on the main branch, loading each of the following objects results in the following number of underlying calls to the API:
- AutoConfig: 1 (phew)
- AutoModel: 2 (model + config)
- AutoTokenizer: 9 (multiple tokenizer files and multiple calls to config)
- pipeline: 13 (all of the above + one extra call to config)
- a sharded model: number of shards + 2
This is a bit excessive, so this PR reduces it as much as it can by using the commit hash of the first file downloaded: if it's the same as something we have in the cache, then all files in that subfolder with the same commit hash are up to date.
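As a rough illustration of the idea (not the code in this PR), the lookup can be sketched as a pure filesystem check. The helper name `cached_file_for_commit` is hypothetical, and the `models--<org>--<name>/snapshots/<commit_hash>/<filename>` layout is an assumption about how the cache is organized on disk:

```python
import tempfile
from pathlib import Path
from typing import Optional


def cached_file_for_commit(
    cache_dir: str, repo_id: str, filename: str, commit_hash: str
) -> Optional[Path]:
    """Hypothetical helper: return the cached path of `filename` at
    `commit_hash`, or None if it is not in the cache. No network call."""
    # Assumed layout: <cache_dir>/models--<org>--<name>/snapshots/<hash>/<file>
    repo_folder = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    candidate = repo_folder / "snapshots" / commit_hash / filename
    return candidate if candidate.is_file() else None


# Build a throwaway cache containing a single file at commit "abc123".
with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "models--org--model" / "snapshots" / "abc123"
    snapshot.mkdir(parents=True)
    (snapshot / "config.json").write_text("{}")

    # Same commit hash and the file is present: usable without a HEAD request.
    hit = cached_file_for_commit(tmp, "org/model", "config.json", "abc123")
    # Not in the cache: either missing from the repo or not downloaded yet.
    miss = cached_file_for_commit(tmp, "org/model", "vocab.txt", "abc123")
```

A `None` result is ambiguous on its own, since it does not say whether the file is absent from the repo or simply has not been fetched.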
As you can see in the tests, it does not completely succeed, because this reasoning can't detect that a file does not exist in the repo: if it's not in the cache, it could simply be because it has not been downloaded yet. Still, it reduces the number of calls seen above to:
- AutoConfig: 1
- AutoModel: 1
- AutoTokenizer: between 2 and 4 depending on the tokenizer
- pipeline: between 2 and 4 depending on the tokenizer
- a sharded model: 2
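To make the remaining calls easier to picture, here is a small simulation with no network access. `FakeHub` and `load_files` are hypothetical stand-ins, not the library's API; the cache is modeled as a set of `(commit_hash, filename)` pairs, and one of the requested files is assumed not to exist in the repo:

```python
class FakeHub:
    """Stand-in for the remote repo that counts simulated HEAD requests."""

    def __init__(self, commit_hash, files):
        self.commit_hash = commit_hash
        self.files = set(files)
        self.head_calls = 0

    def head(self, filename):
        # One simulated HEAD request: returns the commit hash and existence.
        self.head_calls += 1
        return self.commit_hash, filename in self.files


def load_files(hub, cache, filenames):
    """Resolve files, skipping HEAD when the cache has them at the known commit."""
    commit_hash = None  # unknown until the first file has been checked
    resolved = []
    for name in filenames:
        if commit_hash is not None and (commit_hash, name) in cache:
            resolved.append(name)  # up to date, no request needed
            continue
        commit_hash, exists = hub.head(name)
        if exists:
            cache.add((commit_hash, name))  # simulate downloading the file
            resolved.append(name)
    return resolved


hub = FakeHub("abc123", ["config.json", "tokenizer.json"])
cache = set()
# "added_tokens.json" is optional and absent from this fake repo.
wanted = ["config.json", "tokenizer.json", "added_tokens.json"]

load_files(hub, cache, wanted)
cold_calls = hub.head_calls  # empty cache: one HEAD per file

hub.head_calls = 0
load_files(hub, cache, wanted)
warm_calls = hub.head_calls  # first file + the file missing from the cache
```

In this sketch the warm load still makes one call for the first file (to learn the commit hash) and one for the optional file that is not in the cache, which is why tokenizers with more optional files end up with more calls.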