Mac mini M4 (2024): GPU memory error when fitting a forecast model. On CPU it is fine
What happened + What you expected to happen
Hello, I'm using the latest version, 3.0.0, on Python 3.9 or 3.10 in a brand-new .venv.
When I train a model on a "small" dataframe read from about 6 MB of CSV (attached), it runs fine on CPU (accelerator="cpu"), even with 27 days of data concatenated. When I remove the accelerator line, so that the GPU is enabled, an error prevents me from fitting the model. This seems very strange to me. I can attach the .csv data and the notebook I'm using.
The error: RuntimeError: Invalid buffer size: 17.62 GB
NOTE: 17.62 GB is just one example; if I use a different structure or change something, it can also be 42 GB. This feels really strange, since in my (limited) experience neural networks should work fine on so little data.
Can you help in some way? Thanks!
Versions / Dependencies
Python 3.9/3.10 on a Mac mini with the latest macOS. Libraries are the defaults installed by:
pip install neuralforecast==3.0.0
Reproduction script
Issue Severity
High: It blocks me from completing my task.
Hello! Can you try reducing the batch size? That usually solves out-of-memory errors. For example, you could do:

```python
nhits = NHITS(
    h=horizon,
    input_size=input_size,
    batch_size=32,
    windows_batch_size=32,
)
```

Let me know if this works!
No, changing the batch size does not make it work. I tried 32, 16, and 8.
Options (outside of the possible M4 GPU issue):

- Try n_block=2 in TSMixer; your TSMixer model is huge (17.7M parameters).
- Set windows_batch_size=32, inference_windows_batch_size=32.
- Remove the static_df.
- Set n_series to the number of series you want to jointly predict, and preprocess the data accordingly (each column should become a unique_id). TSMixer is a bit of a weird choice if you're doing univariate forecasting (n_series=1) with a multivariate model. I assume you are interested in multivariate forecasting; if not, TSMixer is not a good choice of model. Switch to e.g. NHITS.
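The last option above (each column becomes a unique_id) can be sketched with a pandas melt. The column names here are hypothetical stand-ins for whatever the original CSV contains.

```python
import pandas as pd

# Hypothetical wide dataframe: one column per series
wide = pd.DataFrame({
    "ds": pd.date_range("2025-01-01", periods=4, freq="h"),
    "sensor_a": [1.0, 2.0, 3.0, 4.0],
    "sensor_b": [10.0, 20.0, 30.0, 40.0],
})

# Melt to long format: each former column becomes a unique_id
long_df = wide.melt(id_vars="ds", var_name="unique_id", value_name="y")
long_df = long_df[["unique_id", "ds", "y"]].sort_values(["unique_id", "ds"])

# Pass this to the model, e.g. TSMixer(n_series=n_series, ...)
n_series = long_df["unique_id"].nunique()
```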
With n_blocks=2 it does change the number of parameters, but the buffer size error remains. I could try another model, but I believe the GPU buffer size problem would still remain.
What difference did the other suggestions make?
I was also thinking: try to include only data that the model actually consumes. So if the model doesn't use the exogenous variables in your data, drop them from the input dataframe. I think we currently lack a prefilter for unused data.
I tried passing only 5 columns in total (id, ds, y + 2 exogenous variables). That went well for 1 day. But if I include more days to get more training data, it fails with the same buffer size error.
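The column filtering suggested above can be sketched as follows. "temp", "humidity", and "unused_note" are hypothetical names standing in for the real columns in the attached CSV.

```python
import pandas as pd

# Hypothetical input dataframe with one column the model never consumes
df = pd.DataFrame({
    "unique_id": ["s1"] * 3,
    "ds": pd.date_range("2025-01-01", periods=3, freq="D"),
    "y": [1.0, 2.0, 3.0],
    "temp": [20.1, 19.8, 21.0],      # consumed as an exogenous variable
    "humidity": [0.4, 0.5, 0.45],    # consumed as an exogenous variable
    "unused_note": ["a", "b", "c"],  # never consumed, so drop it
})

# Keep only the columns the model actually uses before calling fit()
used_exog = ["temp", "humidity"]
slim_df = df[["unique_id", "ds", "y"] + used_exog]
```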
In debug mode I investigated a similar issue and found that windows_batch_size / inference_windows_batch_size had no effect at all: the full data set was used to create all the training windows. This was on an NVIDIA GPU.
I suspect this just isn't implemented for some reason. It is probably also the cause of #1357.