
Mac mini M4 (2024): forecasting on GPU hits a memory issue; on CPU it is fine

Mogiaro opened this issue 11 months ago • 7 comments

What happened + What you expected to happen

Hello, I'm using the latest version, 3.0.0, on Python 3.9 or 3.10 in a brand-new .venv.

When I train a model on a "small" dataframe reading about 6 MB of CSV (attached), it runs fine on CPU (accelerator="cpu"), even with 27 days of data concatenated. But when I remove the accelerator line so the GPU is enabled, the resulting error prevents me from fitting the model. This seems very strange to me. I can attach the .csv data and the notebook I'm using.
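For reference, a minimal sketch of the two setups being compared (horizon and input sizes are hypothetical placeholders; accelerator is a PyTorch Lightning Trainer argument that neuralforecast model constructors forward):

```python
# Hypothetical hyperparameters; only the `accelerator` line differs.
works_on_cpu = dict(h=24, input_size=48, accelerator="cpu")  # trains fine
fails_on_mps = dict(h=24, input_size=48)  # auto-selects the Apple GPU (MPS) and hits the buffer error

# Usage, assuming neuralforecast is installed:
#   from neuralforecast.models import NHITS
#   model = NHITS(**works_on_cpu)
```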

message.txt

The error: RuntimeError: Invalid buffer size: 17.62 GB

NOTE: 17.62 GB is just one example; with a different model structure or other changes it can be as much as 42 GB. This feels really strange, since in my limited experience neural networks should have no trouble with so little data...
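One plausible (unconfirmed) explanation for how a 6 MB CSV can demand tens of GB: if every training window is materialized at once, the buffer grows with the number of windows times window length times features, not with the CSV size. A back-of-envelope sketch with purely hypothetical numbers:

```python
def window_buffer_gb(n_timesteps, n_series, input_size, h, n_features, bytes_per_val=4):
    """Rough float32 size of materializing every sliding window at once."""
    windows_per_series = max(n_timesteps - (input_size + h) + 1, 0)
    n_windows = windows_per_series * n_series
    window_len = input_size + h
    return n_windows * window_len * n_features * bytes_per_val / 1e9

# Hypothetical numbers: 27 days of minute-level data, 40 series, a long
# input window, and a few features already land in the same order of
# magnitude as the reported error.
print(round(window_buffer_gb(n_timesteps=27 * 24 * 60, n_series=40,
                             input_size=512, h=96, n_features=5), 2))
```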

Can you help in some way? Thanks!

Versions / Dependencies

Python 3.9/3.10 on a Mac mini running the latest macOS. The libraries are the defaults installed by

pip install neuralforecast==3.0.0

Reproduction script

bug-report-memory.ipynb.zip

bug_report_data.csv

Issue Severity

High: It blocks me from completing my task.

Mogiaro avatar Mar 21 '25 15:03 Mogiaro

Hello! Can you try reducing the batch size? That usually solves out-of-memory errors. For example, you could do:

from neuralforecast.models import NHITS

nhits = NHITS(
    h=horizon,
    input_size=input_size,
    batch_size=32,            # series sampled per training batch
    windows_batch_size=32,    # windows sampled per training batch
)

Let me know if this works!

marcopeix avatar Mar 24 '25 13:03 marcopeix

No, changing the batch size does not make it work. I tried 32, 16, and 8.

Mogiaro avatar Mar 24 '25 14:03 Mogiaro

Options (outside of the possible M4 GPU issue):

  1. Try using n_block=2 in TSMixer, your TSMixer model is huge (17.7M parameters).

  2. Set windows_batch_size=32, inference_windows_batch_size=32.

  3. Remove the static_df.

  4. Set n_series to the number of series you want to jointly predict, and preprocess the data accordingly (each column should be a unique_id). TSMixer is a bit of an odd choice right now, as you're doing univariate forecasting (n_series=1) with a multivariate model. I assume you are interested in multivariate forecasting; if not, TSMixer is not a good choice of model, and you should switch to e.g. NHITS.
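For concreteness, the suggestions above could be collected into one sketch (all sizes are hypothetical placeholders; the parameter names follow neuralforecast's TSMixer and NHITS constructors, assuming version 3.0.0):

```python
# Suggestions 1, 2, and 4 above as constructor settings; the
# h / input_size / n_series values are hypothetical placeholders.
tsmixer_cfg = dict(
    h=24, input_size=48,
    n_series=3,                        # series to predict jointly
    n_block=2,                         # shrink the 17.7M-parameter model
    windows_batch_size=32,
    inference_windows_batch_size=32,
)

# Univariate alternative if multivariate forecasting isn't actually the goal:
nhits_cfg = dict(h=24, input_size=48,
                 windows_batch_size=32, inference_windows_batch_size=32)

# Usage, assuming neuralforecast is installed:
#   from neuralforecast.models import TSMixer, NHITS
#   model = TSMixer(**tsmixer_cfg)  # or NHITS(**nhits_cfg)
```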

elephaint avatar Mar 24 '25 14:03 elephaint

With n_block=2 the number of parameters does change, but the buffer-size error remains. I could try another model, but I believe the GPU buffer-size problem would still remain.

Mogiaro avatar Mar 24 '25 14:03 Mogiaro

> With n_blocks=2 it does change the num of params but the error of buffer size still remains. I could try another model, but i believe the buffer size problem for GPU will still remain

What difference did the other suggestions make?

I was also thinking, try to include only data that the model also consumes. So, if the model doesn't use the exogenous variables in your data, drop them from the input dataframe. I think we currently lack a prefilter for unused data.
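Lacking such a prefilter, the dropping can be done manually before calling fit; a minimal pandas sketch with hypothetical column names:

```python
import pandas as pd

# Toy frame mirroring the issue's layout: id/ds/y plus exogenous columns,
# only some of which the model is configured to consume (names hypothetical).
df = pd.DataFrame({
    "unique_id": ["A", "A", "A"],
    "ds": pd.to_datetime(["2025-03-01", "2025-03-02", "2025-03-03"]),
    "y": [1.0, 2.0, 3.0],
    "exog_used": [0.1, 0.2, 0.3],
    "exog_unused": [9.9, 9.8, 9.7],
})

used_exog = ["exog_used"]  # whatever is passed via hist_exog_list / futr_exog_list
df = df[["unique_id", "ds", "y"] + used_exog]  # drop columns the model ignores
print(list(df.columns))
```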

elephaint avatar Mar 24 '25 19:03 elephaint

I tried passing only 5 columns in total (id, ds, y + 2 exogenous variables). That went well for 1 day, but when I include more days to get more training data, it fails with the same buffer-size error.


Mogiaro avatar Mar 25 '25 08:03 Mogiaro

In debug mode I investigated a similar issue and found that "windows_batch_size" / "inference_windows_batch_size" had no effect at all: the full data set was used to create all the training windows at once. This was on an NVIDIA GPU.

I suspect this just isn't implemented for some reason. It's probably also the cause of #1357.

samcraig678 avatar Sep 05 '25 06:09 samcraig678