Issue-409 Add support for datasets that can't fit in memory
As described in this issue: https://github.com/Nixtla/neuralforecast/issues/409
We assume the dataset is split across multiple parquet files, where each parquet file holds a single time series stored as a pandas dataframe. This PR adds a new Dataset class whose `__getitem__` method reads the parquet file corresponding to the requested index, and a `from_data_directory()` method that mirrors the existing `from_df()` method. A sketch of the idea follows below.
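A minimal sketch of the approach, not the PR's actual implementation: only `from_data_directory()` is named in this description, so the class name and other details here are hypothetical.

```python
from pathlib import Path

import pandas as pd
from torch.utils.data import Dataset


class ParquetTimeSeriesDataset(Dataset):
    """Lazily loads one parquet file (one series) per index.

    Hypothetical illustration of the lazy-loading dataset described above.
    """

    def __init__(self, files):
        self.files = files  # list of parquet file paths, one per series

    @classmethod
    def from_data_directory(cls, directory):
        # Mirror of from_df(): discover one parquet file per series
        files = sorted(Path(directory).glob("*.parquet"))
        return cls(files)

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        # Only the requested series is read into memory
        return pd.read_parquet(self.files[idx])
```

Because each `__getitem__` call reads a single file, memory usage stays bounded by the size of one series rather than the full dataset.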
I have added a test at the end of core.ipynb that checks that the forecasts produced with this distributed dataset match those produced when the dataset is passed in directly as a pandas dataframe.
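A hedged sketch of the kind of equivalence check described above; the actual test lives in core.ipynb. `ParquetTimeSeriesDataset` is the hypothetical class from the previous snippet, and the toy dataframe follows Nixtla's long format (`unique_id`, `ds`, `y`).

```python
import tempfile

import pandas as pd

df = pd.DataFrame({
    "unique_id": ["a"] * 3 + ["b"] * 3,
    "ds": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"] * 2),
    "y": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

with tempfile.TemporaryDirectory() as tmp:
    # One parquet file per series, matching the assumed directory layout
    for uid, group in df.groupby("unique_id"):
        group.to_parquet(f"{tmp}/{uid}.parquet")
    ds = ParquetTimeSeriesDataset.from_data_directory(tmp)
    # Round-trip check: reassembling the series recovers the original data
    roundtrip = pd.concat([ds[i] for i in range(len(ds))], ignore_index=True)
    pd.testing.assert_frame_equal(
        roundtrip.sort_values(["unique_id", "ds"]).reset_index(drop=True),
        df.sort_values(["unique_id", "ds"]).reset_index(drop=True),
    )
```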