feast
feast copied to clipboard
feat: Updated Trino class to support custom Connection object
What this PR does / why we need it:
It updates Trino class to support more customized authorized Connection object. E.g. I have a conn that's authorized with http-headers.
Just put here for discussion, an alternative way of supporting conn
without removing _get_curosr(self)
method can be like
diff --git a/sdk/python/feast/infra/offline_stores/contrib/trino_offline_store/trino_queries.py b/sdk/python/feast/infra/offline_stores/contrib/trino_offline_store/trino_queries.py
index 1d4b5881..159a4e83 100644
--- a/sdk/python/feast/infra/offline_stores/contrib/trino_offline_store/trino_queries.py
+++ b/sdk/python/feast/infra/offline_stores/contrib/trino_offline_store/trino_queries.py
@@ -11,7 +11,7 @@ import numpy as np
import pandas as pd
import pyarrow as pa
import trino
-from trino.dbapi import Cursor
+from trino.dbapi import Connection, Cursor
from trino.exceptions import TrinoQueryError
from feast.infra.offline_stores.contrib.trino_offline_store.trino_type_map import (
@@ -36,6 +36,7 @@ class Trino:
catalog: Optional[str] = None,
auth: Optional[Any] = None,
http_scheme: Optional[str] = None,
+ conn: Optional[Connection] = None,
):
self.host = host or os.getenv("TRINO_HOST")
self.port = port or os.getenv("TRINO_PORT")
@@ -43,6 +44,7 @@ class Trino:
self.catalog = catalog or os.getenv("TRINO_CATALOG")
self.auth = auth or os.getenv("TRINO_AUTH")
self.http_scheme = http_scheme or os.getenv("TRINO_HTTP_SCHEME")
+ self._conn = conn
self._cursor: Optional[Cursor] = None
if self.host is None:
@@ -56,15 +58,17 @@ class Trino:
def _get_cursor(self) -> Cursor:
if self._cursor is None:
- self._cursor = trino.dbapi.connect(
- host=self.host,
- port=self.port,
- user=self.user,
- catalog=self.catalog,
- auth=self.auth,
- http_scheme=self.http_scheme,
- ).cursor()
-
+ if self._conn is None:
+ self._cursor = trino.dbapi.connect(
+ host=self.host,
+ port=self.port,
+ user=self.user,
+ catalog=self.catalog,
+ auth=self.auth,
+ http_scheme=self.http_scheme,
+ ).cursor()
+ else:
+ self._cursor = self._conn.cursor()
return self._cursor
def create_query(self, query_text: str) -> Query:
Which issue(s) this PR fixes:
Fixes # NA
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: zyxue
To complete the pull request process, please assign woop after the PR has been reviewed.
You can assign the PR to them by writing /assign @woop
in a comment when ready.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
hmmm.. this still doesn't quite solve the problem as it's unclear how user would control how to call the Trino class
@zyxue have you seen how snowflake does this? I think it has some logic to get a dictionary of arguments from the feature_store.yaml and pass it into the method that creates the connection object.
https://github.com/feast-dev/feast/blob/3dd779f4038a757bcf0b935626605ee1e7f04bf6/sdk/python/feast/infra/utils/snowflake_utils.py#L46
Thank you for comment, @achals . the thing is that the Connection
object in trino https://github.com/trinodb/trino-python-client/blob/master/trino/dbapi.py#L93 supports many args, and not all from can be specified in yaml, e.g. auth
is supposed to be a class object. Will the snowflake/feature_store.yaml
's way still apply?
@zyxue Good point. Creating custom objects would be possible (by passing in nested arguments in the feature_store.yaml
, and configuring how the nested objects (like auth) would be created in some kind of get_trino_conn
method that parsed this config information.
An alternative approach would requiring refactoring how the offline store is instantated to allow for some custom logic in the creation of the offline store connection.. Which will require some more thought and probably an RFC.
Codecov Report
Merging #2820 (0e16e78) into master (33141f8) will decrease coverage by
20.83%
. The diff coverage is14.28%
.
@@ Coverage Diff @@
## master #2820 +/- ##
===========================================
- Coverage 80.45% 59.61% -20.84%
===========================================
Files 173 174 +1
Lines 15254 15497 +243
===========================================
- Hits 12273 9239 -3034
- Misses 2981 6258 +3277
Flag | Coverage Δ | |
---|---|---|
integrationtests | ? |
|
unittests | 59.61% <14.28%> (-0.09%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
...tores/contrib/trino_offline_store/trino_queries.py | 43.95% <5.00%> (-0.13%) |
:arrow_down: |
...stores/contrib/trino_offline_store/trino_source.py | 54.73% <37.50%> (-1.45%) |
:arrow_down: |
.../integration/online_store/test_online_retrieval.py | 16.84% <0.00%> (-80.00%) |
:arrow_down: |
...fline_store/test_universal_historical_retrieval.py | 23.80% <0.00%> (-76.20%) |
:arrow_down: |
sdk/python/tests/utils/online_read_write_test.py | 19.35% <0.00%> (-74.20%) |
:arrow_down: |
.../integration/online_store/test_universal_online.py | 17.32% <0.00%> (-73.14%) |
:arrow_down: |
...ests/integration/e2e/test_python_feature_server.py | 28.35% <0.00%> (-71.65%) |
:arrow_down: |
...dk/python/tests/integration/e2e/test_validation.py | 27.77% <0.00%> (-71.43%) |
:arrow_down: |
sdk/python/feast/wait.py | 23.52% <0.00%> (-70.59%) |
:arrow_down: |
...ion/registration/test_stream_feature_view_apply.py | 28.84% <0.00%> (-69.20%) |
:arrow_down: |
... and 86 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 33141f8...0e16e78. Read the comment docs.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.