
How to do inference without connecting to HuggingFace?

Open ForestsKing opened this issue 1 year ago • 3 comments

The connection between my server and Hugging Face is not very reliable. I have already downloaded the model weights. I would like to know whether it is possible to avoid connecting to Hugging Face when calling Chronos; the connection often takes a long time and may fail. Thanks!

ForestsKing avatar Mar 31 '24 03:03 ForestsKing

@ForestsKing typically, using an HF model prefix should not add significant overhead. However, if you're facing connection issues, you can download the model first and load it from a local path. Here's how to do it:

  • Download the model. You can do this in one of the following ways:
    • Clone the HF repo using git lfs as described here.
    • OR, if you have used the model once, it should already be in your cache. HF models are saved under ~/.cache/huggingface/hub/models--<model-name>/snapshots/<commit-hash>/. Here's an example path from my machine: ~/.cache/huggingface/hub/models--amazon--chronos-t5-small/snapshots/6cb0a414b8bc7ed3cfdcb7edac48a9778dd175f8/. You can copy this directory to another, more accessible location.
  • Once you have the model in a local path (let's say ./checkpoints/chronos-t5-small/), you can load it as follows:
import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "./checkpoints/chronos-t5-small",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)
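The cache-copy step above can be sketched in Python. This is a minimal helper, assuming the standard Hugging Face hub cache layout (`models--<org>--<name>/snapshots/<commit-hash>/`); the function name `copy_cached_model` and the use of the lexicographically last snapshot are illustrative choices, not part of any library API:

```python
import glob
import os
import shutil


def copy_cached_model(model_name: str, dest: str,
                      cache_dir: str = os.path.expanduser("~/.cache/huggingface/hub")) -> str:
    """Copy a previously downloaded HF model snapshot to a more accessible directory.

    model_name: e.g. "amazon/chronos-t5-small"; slashes become "--" in the cache layout.
    Assumes the model has been downloaded at least once, so a snapshot exists in the cache.
    """
    repo_dir = os.path.join(cache_dir, "models--" + model_name.replace("/", "--"))
    snapshots = sorted(glob.glob(os.path.join(repo_dir, "snapshots", "*")))
    if not snapshots:
        raise FileNotFoundError(f"No cached snapshot found for {model_name!r} in {cache_dir}")
    # copytree with symlinks=False (the default) follows the symlinks that HF uses
    # inside snapshot dirs, so the destination gets real file contents, not links.
    shutil.copytree(snapshots[-1], dest, dirs_exist_ok=True)
    return dest
```

You could then pass the returned path (e.g. `./checkpoints/chronos-t5-small`) to `ChronosPipeline.from_pretrained` as in the snippet above, with no network access required.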

abdulfatir avatar Mar 31 '24 12:03 abdulfatir

Thanks!

ForestsKing avatar Mar 31 '24 12:03 ForestsKing

Leaving open as FAQ

lostella avatar Mar 31 '24 17:03 lostella