chronos-forecasting
How to do inference without connecting to HuggingFace?
The connection between my server and Hugging Face is not very stable. I have already downloaded the model weights. Is it possible to avoid connecting to Hugging Face when calling Chronos? The connection often takes a long time and may fail. Thanks!
@ForestsKing typically, using a HF model prefix should not have a significant overhead. However, if you're facing issues with your connection, you might try downloading the model first and loading from a local path. Here's how to do it:
- Download the model. You can do this in one of the following ways:
  - Clone the HF repo using `git lfs`, as described here.
  - OR, if you have used the model once, it should already be in your cache. HF models are saved in `~/.cache/huggingface/hub/models--<model-name>/snapshots/<commit-hash>/`. Here's an example path from my machine: `~/.cache/huggingface/hub/models--amazon--chronos-t5-small/snapshots/6cb0a414b8bc7ed3cfdcb7edac48a9778dd175f8/`. You can copy this directory to a more accessible location.
- Once you have the model in a local path (let's say `./checkpoints/chronos-t5-small/`), you can load it as follows:
```python
import torch
from chronos import ChronosPipeline

# Load from a local checkpoint directory instead of the HF hub.
pipeline = ChronosPipeline.from_pretrained(
    "./checkpoints/chronos-t5-small",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)
```
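If you want to find the cached copy programmatically before copying it, a small helper can reconstruct the cache path described above. This is just a sketch based on the standard `models--<org>--<name>/snapshots/<commit-hash>` layout of the HF hub cache; the function names are my own:

```python
import os
from pathlib import Path


def model_cache_dir(model_name: str) -> Path:
    """Return the HF hub cache directory for a model.

    Assumes the standard cache layout:
    ~/.cache/huggingface/hub/models--<org>--<name>/snapshots/<commit-hash>/
    """
    return (
        Path.home()
        / ".cache" / "huggingface" / "hub"
        / ("models--" + model_name.replace("/", "--"))
        / "snapshots"
    )


def cached_snapshots(model_name: str) -> list:
    """List snapshot directories already in the cache (empty if none)."""
    base = model_cache_dir(model_name)
    if not base.is_dir():
        return []
    return sorted(p for p in base.iterdir() if p.is_dir())


# Example: locate cached copies of chronos-t5-small, ready to copy elsewhere.
print(cached_snapshots("amazon/chronos-t5-small"))
```

Separately, `huggingface_hub` honors the `HF_HUB_OFFLINE=1` environment variable, which makes it skip network calls and use only local files, so setting it before running inference should also prevent connection attempts.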
Thanks!
Leaving open as FAQ