bug:postgres source invalid `pull_latest_from_table_or_query` query output
I would submit a PR but I'm on an M1 and had some issues getting the env running. Submitting a proposed solution for the sake of available time.
Expected Behavior
When using PostgreSQLSource, you should be able to use a table successfully. When running feast masterialize [START_DATE] [END_DATE], as a user I would expect the table to be selected on successfully with no errors.
source=PostgreSQLSource(
name="source_user_events_v0",
timestamp_field="created_at",
table="offline_store.user_events"
)
Current Behavior
pull_latest_from_table_or_query attempts to pull the latest data using the outputted string from get_table_query_string. However on line 97 of postgres.py the query includes open and close parens assuming an inner query. These params should be removed from the query and handled by get_table_query_string (which it is).
Steps to reproduce
Create a source as part of a feature view. Reference an existing table.
source=PostgreSQLSource(
name="source_user_events_v0",
timestamp_field="created_at",
table="offline_store.user_events"
)
execute feast materialize [START_DATE] [END_DATE]
expected output.
root@3b9fc17aa598:/usr/app# feast materialize 2023-10-16T00:00:00 2023-10-18T00:00:00
/usr/local/lib/python3.9/site-packages/feast/repo_config.py:233: RuntimeWarning: `entity_key_serialization_version` is either not specified in the feature_store.yaml, or is specified to a value <= 1.This serialization version may cause errors when trying to write fields with the `Long` data type into the online store. Specifying `entity_key_serialization_version` to 2 is recommended for new projects.
warnings.warn(
Materializing 1 feature views from 2023-10-16 00:00:00+00:00 to 2023-10-18 00:00:00+00:00 into the postgres online store.
fv_user_events_v0:
Traceback (most recent call last):
File "/usr/local/bin/feast", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/feast/cli.py", line 546, in materialize_command
store.materialize(
File "/usr/local/lib/python3.9/site-packages/feast/usage.py", line 299, in wrapper
raise exc.with_traceback(traceback)
File "/usr/local/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/feast/feature_store.py", line 1395, in materialize
provider.materialize_single_feature_view(
File "/usr/local/lib/python3.9/site-packages/feast/infra/passthrough_provider.py", line 254, in materialize_single_feature_view
raise e
File "/usr/local/lib/python3.9/site-packages/feast/infra/materialization/local_engine.py", line 156, in _materialize_one
table = offline_job.to_arrow()
File "/usr/local/lib/python3.9/site-packages/feast/infra/offline_stores/offline_store.py", line 122, in to_arrow
return self._to_arrow_internal(timeout=timeout)
File "/usr/local/lib/python3.9/site-packages/feast/infra/offline_stores/contrib/postgres_offline_store/postgres.py", line 282, in _to_arrow_internal
cur.execute(query)
psycopg2.errors.SyntaxError: syntax error at or near ")"
LINE 8: FROM (offline_store.user_events) a
^
root@3b9fc17aa598:/usr/app#
Specifications
- Version: 0.34.1
- Platform: Linux
- Subsystem: Ubuntu 20.04
Possible Solution
Update this query from:
query = f"""
SELECT
{b_field_string}
{f", {repr(DUMMY_ENTITY_VAL)} AS {DUMMY_ENTITY_ID}" if not join_key_columns else ""}
FROM (
SELECT {a_field_string},
ROW_NUMBER() OVER({partition_by_join_key_string} ORDER BY {timestamp_desc_string}) AS _feast_row
FROM ({from_expression}) a
WHERE a."{timestamp_field}" BETWEEN '{start_date}'::timestamptz AND '{end_date}'::timestamptz
) b
WHERE _feast_row = 1
"""
to:
query = f"""
SELECT
{b_field_string}
{f", {repr(DUMMY_ENTITY_VAL)} AS {DUMMY_ENTITY_ID}" if not join_key_columns else ""}
FROM (
SELECT {a_field_string},
ROW_NUMBER() OVER({partition_by_join_key_string} ORDER BY {timestamp_desc_string}) AS _feast_row
FROM {from_expression} a
WHERE a."{timestamp_field}" BETWEEN '{start_date}'::timestamptz AND '{end_date}'::timestamptz
) b
WHERE _feast_row = 1
"""
hello! I ran exactly into the same issue and fixed it by applying the fix suggested by @david-dest01.
thanks for testing @danielsalvador
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Am I mistaken or was the PR to fix this never merged?
I'm experiencing the same issue at the moment.
@job-almekinders the ticket was closed by maintainers - it was never resolved. https://github.com/feast-dev/feast/pull/3807
I'll re-open one tomorrow one tomorrow, after rebasing on master!
I re-opened a similar PR #4026
@job-almekinders @david-dest01 thanks for re-opening and tackling this. We've recently (over the last several months) transitioned to new maintainers/contributors and it's been a bit tricky dealing with some of the older issues & PR's that we inherited and figuring out what's still relevant & active. We're definitely open to community contributions/issues! Much appreciated :)
@jeremyary the previous PR was most likely eventually closed because of a failed DCO check. just saying... 😄
@jeremyary - glad to see feast getting some love :)