feast
Error in FileOfflineStore.get_historical_features.<locals>.evaluate_historical_retrieval()
Expected Behavior
.to_df() should return a dataframe.
Current Behavior
When trying to execute .to_df() after executing .to_df(validation_reference=<some-validation-reference/>), the self.evaluation_function().compute() call in the _to_df_internal() method inside the FileRetrievalJob class fails.
While the Data Quality Monitoring feature that is implemented through validation_reference is still in alpha, a plain .to_df() call should not be affected by it.
Steps to reproduce
I've provided a minimal reproducible example in this notebook
Specifications
- Version: 0.23
- Platform: Python3.8
- Subsystem:
Possible Solution
Following the stack trace, it appears that there's an issue with the created date.
hey @franciscojavierarceo, thanks for reporting this and adding a notebook! I was able to repro the bug very easily
I think the issue is that your saved dataset is being stored in data/driver_stats.parquet, which is the location of the file source - hence the file source is being overwritten (and in particular, the data at that location no longer has a created column), and so once you try to run the historical retrieval job again, it fails since the created column has essentially been deleted from its perspective
is there a particular reason you're trying to save the dataset to data/driver_stats.parquet?
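To make the failure mode above concrete, here is a rough sketch of what overwriting the source does to its schema. It does not use Feast at all: CSV files and the standard library stand in for the parquet file source, the helper names (write_rows, read_columns, demo_overwrite) are made up for this illustration, and every column other than created is an assumed placeholder.

```python
import csv
import tempfile
from pathlib import Path


def write_rows(path, fieldnames, rows):
    """Write rows to a CSV file, replacing whatever was there before."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_columns(path):
    """Return the column names found in the file's header row."""
    with open(path, newline="") as f:
        return csv.DictReader(f).fieldnames


def demo_overwrite():
    """Show how saving a dataset onto the source path drops a source column."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the file source (data/driver_stats.parquet in the report).
        source = Path(tmp) / "driver_stats.csv"
        write_rows(
            source,
            ["driver_id", "event_timestamp", "created", "conv_rate"],
            [{"driver_id": 1001, "event_timestamp": "2022-01-01",
              "created": "2022-01-02", "conv_rate": 0.5}],
        )
        before = read_columns(source)

        # Stand-in for the saved dataset: retrieval output without the
        # created column, written to the very same path as the source.
        write_rows(
            source,
            ["driver_id", "event_timestamp", "conv_rate"],
            [{"driver_id": 1001, "event_timestamp": "2022-01-01",
              "conv_rate": 0.5}],
        )
        after = read_columns(source)
        return before, after


if __name__ == "__main__":
    before, after = demo_overwrite()
    print("created in source before save:", "created" in before)
    print("created in source after save:", "created" in after)
```

Any later read that expects the created column at that path would fail in the same way the second retrieval does here.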
I think the issue is that your saved dataset is being stored in data/driver_stats.parquet, which is the location of the file source - hence the file source is being overwritten (and in particular, the data at that location no longer has a created column),
Should we prevent overwriting any existing files?
@achals yup I think that's the correct solution here
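For what it's worth, the kind of guard being discussed could look roughly like the sketch below. This is not Feast's actual code or API: check_destination_is_free is a hypothetical helper, and where such a check would live in the saved dataset code path is left open.

```python
from pathlib import Path


def check_destination_is_free(destination, source_paths=()):
    """Refuse to write a saved dataset over a file source or an existing file.

    Hypothetical helper: `destination` is where the saved dataset would be
    written, `source_paths` are paths of known file sources.
    """
    dest = Path(destination).resolve()

    # Writing onto a file source would silently change its schema.
    for src in source_paths:
        if dest == Path(src).resolve():
            raise ValueError(
                f"Saved dataset path {dest} is also a file source path"
            )

    # More generally, never overwrite something that already exists.
    if dest.exists():
        raise FileExistsError(f"Refusing to overwrite existing path {dest}")

    return dest
```

With a check like this, saving to data/driver_stats.parquet would raise an error up front instead of breaking the next historical retrieval.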