What is the recommended `torch_dtype`?
Hello there, what would you recommend as the best `torch_dtype` param, given the tradeoffs? Or was the model trained only using bfloat16? Thanks for the answer.
@CoCoNuTeK The models were trained with `tf32` (a 19-bit CUDA floating point format that's a replacement for `fp32`). We recommend `bf16` for inference, especially if your hardware supports it. It should require less memory and be much faster than `fp32`. Please note that we are talking about the model's parameters (`torch_dtype` in the pipeline) here. DO NOT cast your time series into `bf16`: it keeps only about 8 bits of precision, so the cast may result in loss of information.
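To make the "loss of information" point concrete, here is a minimal sketch that emulates `bf16` rounding using only the Python standard library. The helper `to_bf16` and the sample prices are made up for illustration and are not part of Chronos or PyTorch. The sketch assumes finite inputs and round-to-nearest-even.

```python
import struct

def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (emulated; finite inputs only)."""
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of float32 (sign, 8 exponent bits,
    # 7 mantissa bits). Add a bias so that dropping the low 16 bits rounds
    # to nearest, with ties going to the even value.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Hypothetical price series, for illustration only.
prices = [1234.56, 1235.10, 1236.75, 1238.20]
bf16_prices = [to_bf16(p) for p in prices]
print(bf16_prices)  # [1232.0, 1232.0, 1240.0, 1240.0]
```

Between 1024 and 2048, neighbouring `bf16` values are 8 apart, so four distinct prices collapse into two. That is why the series itself should stay in its original precision while only the model weights use `bf16`.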
Ah, okay, so I just keep my datapoints in the format they are in. So if it's stock data, I just feed it in as is. Thanks for the info. And for the fine-tuning part, should I use `bf16` as well?
For fine-tuning, the recommended settings are in the training script, which uses `tf32`. Of course, you're free to experiment with other dtypes and hyperparameters.
P.S.: I don't want to constrain your creativity but please be mindful when applying a univariate pretrained model such as Chronos to stock data, which is often heavily influenced by external factors. :)
For long-term predictions, sure, but some day-trading setups could work. If I try 1 tick = 5 mins, say, it could hopefully find something interesting. I will let you know if you want.