uni2ts
Bug when trying to prepare custom dataset for finetuning
I've run into a bug that I can't fix when trying to prepare a dataset for finetuning.
Here's the code:
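from collections.abc import Generator
from pathlib import Path
from typing import Any

import pandas as pd
from datasets import Dataset, Features, Sequence, Value

# df is the Store 1 DataFrame with the Date column set as its index.
def data_generator() -> Generator[dict[str, Any]]:
    yield {
        "target": df['Weekly_Sales'].to_numpy(),
        "start": df.index[0],
        "freq": pd.infer_freq(df.index),
        "item_id": "1",
    }

features = Features(
    dict(
        target=Sequence(Value("float32")),
        start=Value("date32"),
        freq=Value("string"),
        item_id=Value("string"),
    )
)

hf_dataset = Dataset.from_generator(data_generator, features=features)
hf_dataset.save_to_disk(Path("sales_dataset/"))

# Export the Hugging Face dataset (a single record) to CSV.
df = hf_dataset.to_pandas()
df.to_csv('sales_dataset/sales_data.csv', index=False)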
Then, when I run python -m uni2ts.data.builder.simple sales_data sales_dataset/sales_data.csv --offset 40 --dataset_type long, I get the error:
IndexError: index 0 is out of bounds for axis 0 with size 0
I'm not sure why that happens, since neither my df nor the .csv is empty.
What am I missing?
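One thing worth checking is what the exported CSV actually contains. The sketch below does not use uni2ts or datasets at all: it builds a one-row pandas DataFrame as a stand-in for hf_dataset.to_pandas() (the values are placeholders, not the real sales data) and round-trips it through CSV. Whether this layout is what triggers the IndexError is not confirmed here; it only shows that the file holds a single row, with the whole series stored as text in one cell.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for hf_dataset.to_pandas(): one record that holds a whole series.
# The numbers are placeholders, not the real Weekly_Sales values.
record = pd.DataFrame(
    {
        "target": [np.arange(5, dtype="float32")],
        "start": [pd.Timestamp("2010-02-05")],
        "freq": ["W-FRI"],
        "item_id": ["1"],
    }
)

# Write to an in-memory CSV and read it back, mirroring to_csv(..., index=False).
buffer = io.StringIO()
record.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)

n_rows = len(round_trip)
target_cell = round_trip.loc[0, "target"]

print(n_rows)                       # number of data rows in the CSV
print(type(target_cell).__name__)   # type of the "target" cell after reading back
print(target_cell)
```

If the real sales_data.csv looks like this, it is not a table with one timestamp per row, which may matter to whatever reads it next.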
Hi @marcopeix,
Could you please provide a sample .csv you used? We can look more into it.
@liu-jc sure here's the CSV I'm using: https://raw.githubusercontent.com/marcopeix/FoundationModelsForTimeSeriesForecasting/main/data/walmart_sales_small.csv
I'm only using data for Store==1 (143 rows of data) and the first three columns only (Store, Date, Weekly_Sales). Prior to running the function, I set the index as the Date column.
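A minimal sketch of that preprocessing, for anyone trying to reproduce it. The inline rows are made up and stand in for the linked file; the Holiday_Flag column and the ISO date format are assumptions used only to make the example self-contained.

```python
import io

import pandas as pd

# Made-up rows standing in for walmart_sales_small.csv.
# Holiday_Flag and the date format are assumptions for illustration.
raw = io.StringIO(
    "Store,Date,Weekly_Sales,Holiday_Flag\n"
    "1,2010-02-05,100.0,0\n"
    "1,2010-02-12,110.0,1\n"
    "1,2010-02-19,105.0,0\n"
    "1,2010-02-26,120.0,0\n"
    "2,2010-02-05,200.0,0\n"
    "2,2010-02-12,210.0,1\n"
)

df = pd.read_csv(raw, parse_dates=["Date"])

# Keep Store 1 only, keep the first three columns, and index by Date.
df = df.loc[df["Store"] == 1, ["Store", "Date", "Weekly_Sales"]].set_index("Date")

# Same call as in the generator; it needs a regular DatetimeIndex to succeed.
freq = pd.infer_freq(df.index)

print(df)
print(freq)
```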
@liu-jc, did you have time to take a look at this? It's blocking me in my progress! Thanks!
Closing this, as it is the same issue as #122.