Error in FileOfflineStore.get_historical_features.<locals>.evaluate_historical_retrieval()

Open franciscojavierarceo opened this issue 2 years ago • 3 comments

Expected Behavior

.to_df() should return a dataframe.

Current Behavior

When calling .to_df() after having called .to_df(validation_reference=<some-validation-reference>), the self.evaluation_function().compute() call in the _to_df_internal() method of the FileRetrievalJob class fails.

While the Data Quality Monitoring feature that is implemented through validation_reference is still in alpha, a subsequent plain .to_df() call should not fail.

Steps to reproduce

I've provided a minimally reproducible example in this notebook

Specifications

  • Version: 0.23
  • Platform: Python3.8
  • Subsystem:

Possible Solution

Following the stack trace it appears that there's an issue with the created date.

franciscojavierarceo, Aug 11 '22 03:08

hey @franciscojavierarceo, thanks for reporting this and adding a notebook! I was able to repro the bug very easily

I think the issue is that your saved dataset is being stored in data/driver_stats.parquet, which is the location of the file source - hence the file source is being overwritten (and in particular, the data at that location no longer has a created column), and so once you try to run the historical retrieval job again, it fails since the created column has essentially been deleted from its perspective

is there a particular reason you're trying to save the dataset to data/driver_stats.parquet?
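
To make the failure mode above concrete, here is a minimal stand-in sketch. It does not use Feast or Parquet: it uses CSV files and only the Python standard library, and the column names other than created (driver_id, event_timestamp, conv_rate) as well as the helpers write_rows and read_columns are assumptions made for illustration. It only shows that writing a second dataset to the same path as the source drops a column the next read expects.

```python
import csv
import tempfile
from pathlib import Path


def write_rows(path, fieldnames, rows):
    """Write rows to `path`, replacing whatever file was there before."""
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_columns(path):
    """Return the header row (column names) of the file at `path`."""
    with path.open(newline="") as f:
        return next(csv.reader(f))


# Hypothetical stand-in for data/driver_stats.parquet.
source_path = Path(tempfile.mkdtemp()) / "driver_stats.csv"

# 1. The original file source includes a `created` column.
write_rows(
    source_path,
    ["driver_id", "event_timestamp", "created", "conv_rate"],
    [{"driver_id": 1001, "event_timestamp": "2022-08-01",
      "created": "2022-08-02", "conv_rate": 0.5}],
)
columns_before = read_columns(source_path)

# 2. A "saved dataset" without `created` is written to the SAME path,
#    silently replacing the source.
write_rows(
    source_path,
    ["driver_id", "event_timestamp", "conv_rate"],
    [{"driver_id": 1001, "event_timestamp": "2022-08-01", "conv_rate": 0.5}],
)
columns_after = read_columns(source_path)

print("created" in columns_before)
print("created" in columns_after)
```

Any later read that relies on the created column is then working against a file that no longer has it, which matches the behaviour described in this comment.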

felixwang9817, Aug 11 '22 21:08

I think the issue is that your saved dataset is being stored in data/driver_stats.parquet, which is the location of the file source - hence the file source is being overwritten (and in particular, the data at that location no longer has a created column),

Should we prevent overwriting any existing files?

achals, Aug 11 '22 21:08

@achals yup I think that's the correct solution here
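
As a rough sketch of what "prevent overwriting any existing files" could look like, the guard below refuses to hand back a path that already exists unless the caller opts in. This is not Feast's actual implementation: check_path_is_free, DatasetPathAlreadyExists and the allow_overwrite flag are hypothetical names chosen for this example, and it uses only the Python standard library.

```python
import tempfile
from pathlib import Path


class DatasetPathAlreadyExists(Exception):
    """Hypothetical error raised when the target path is already in use."""


def check_path_is_free(path, allow_overwrite=False):
    """Return `path` as a Path, or raise if a file already exists there.

    `allow_overwrite` is an assumed opt-in flag for callers who really
    want to replace the existing file.
    """
    target = Path(path)
    if target.exists() and not allow_overwrite:
        raise DatasetPathAlreadyExists(
            f"Refusing to overwrite existing file: {target}"
        )
    return target


# Usage: an existing file (standing in for the file source) is protected.
existing = Path(tempfile.mkdtemp()) / "driver_stats.parquet"
existing.write_bytes(b"placeholder for the original source data")

try:
    check_path_is_free(existing)
    blocked = False
except DatasetPathAlreadyExists:
    blocked = True

# A fresh path in the same directory is accepted.
fresh = check_path_is_free(existing.with_name("my_saved_dataset.parquet"))
print(blocked, fresh.name)
```

Raising by default, with an explicit opt-in to overwrite, keeps the accidental case from the notebook from silently destroying the source while still leaving an escape hatch for intentional replacement.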

felixwang9817, Aug 11 '22 21:08