
data sources displayed differently depending on "inline" definition

Open cburroughs opened this issue 1 year ago • 1 comments

Expected Behavior

Many of the docs define data sources "inline" within the feature view. For example, from https://docs.feast.dev/getting-started/concepts/feature-view:

driver_stats_fv = FeatureView(
    #...
    source=BigQuerySource(
        table="feast-oss.demo_data.driver_activity"
    )
)

I would expect the above example to behave the same whether the BigQuerySource is defined inline as shown, or first assigned to a variable (foo = BigQuerySource(...)) with foo then passed in as the source.

Current Behavior

node = feast.Entity(name='node', join_keys=['node'])

node_temp_file_fv = feast.FeatureView(
    name='node_temperature_fv',
    entities=[node],
    schema=[
        feast.Field(name='node', dtype=feast.types.String),
        feast.Field(name='temp_f', dtype=feast.types.Int64),
    ],
    online=False,
    source=feast.FileSource(
        name='temperature_source',
        path='./data/temperature_source.parquet',
        timestamp_field='timestamp',
    )
)

$ feast apply
Created entity node
Created feature view node_temperature_fv

Created sqlite table demo_node_temperature_fv

$ feast data-sources list
NAME    CLASS

The inline source is not listed. Now change the above definition so that the source is defined separately and passed in:

node_temperature_file_source = feast.FileSource(
    name='temperature_source',
    path='./data/temperature_source.parquet',
    timestamp_field='timestamp',
)

node_temp_file_fv = feast.FeatureView(
    name='node_temperature_fv',
    entities=[node],
    schema=[
        feast.Field(name='node', dtype=feast.types.String),
        feast.Field(name='temp_f', dtype=feast.types.Int64),
    ],
    online=False,
    source=node_temperature_file_source
)

$ feast apply
No changes to registry
No changes to infrastructure

But despite "no changes" being made, the source is now listed:

$ feast data-sources list
NAME                CLASS
temperature_source  <class 'feast.infra.offline_stores.file_source.FileSource'>

Specifications

  • Version: Feast SDK Version: "feast 0.22.2"
  • Platform: x86_64 on Python 3.9.12
  • Subsystem: Linux 5.4.188

cburroughs avatar Aug 01 '22 22:08 cburroughs

I was able to repro this - the issue is with how we're parsing repos (parse_repo in repo_operations.py)

specifically, when we have an inline data source defined, we're just not registering it to be applied

I should have a fix out soon!
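
To illustrate the idea, here is a minimal sketch of collecting data sources from a repo so that inline ones are picked up too. This is not the actual parse_repo code: DataSource, FeatureView, and collect_data_sources below are hypothetical stand-ins, not Feast classes or functions, and the real fix may well look different.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


# Hypothetical stand-ins for Feast's objects, for illustration only.
@dataclass(frozen=True)
class DataSource:
    name: str
    path: str


@dataclass(frozen=True)
class FeatureView:
    name: str
    source: DataSource


def collect_data_sources(repo_objects: Iterable[object]) -> List[DataSource]:
    """Gather data sources defined at module level AND those only
    reachable through a feature view's `source` attribute."""
    found: Dict[str, DataSource] = {}
    for obj in repo_objects:
        if isinstance(obj, DataSource):
            # Source defined as its own variable in the repo.
            found.setdefault(obj.name, obj)
        elif isinstance(obj, FeatureView):
            # Source defined inline: only visible via the feature view.
            found.setdefault(obj.source.name, obj.source)
    return list(found.values())


# Inline definition: the source exists only inside the feature view.
inline_repo = [
    FeatureView(
        name="node_temperature_fv",
        source=DataSource("temperature_source", "./data/temperature_source.parquet"),
    )
]

# Separate definition: the source is also a top-level object.
shared = DataSource("temperature_source", "./data/temperature_source.parquet")
separate_repo = [shared, FeatureView(name="node_temperature_fv", source=shared)]
```

With this approach both repo layouts yield the same single temperature_source entry, which is the behavior the issue expects from feast data-sources list.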

felixwang9817 avatar Aug 05 '22 23:08 felixwang9817