feat: Allow extra fields in Snowflake connections
What this PR does / why we need it:
There are many additional configuration options that can be specified when creating a Snowflake connection, for example proxies, the session keep-alive heartbeat interval, etc.
One concrete use case I've found already is setting the host name when testing Snowflake locally with LocalStack.
This PR allows setting additional configuration options in the feature_store.yaml file by changing the Snowflake Pydantic models to allow extra fields.
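As a rough illustration of what "allowing extra fields" means, here is a minimal sketch assuming Pydantic v2. `SnowflakeConfigSketch` and its reduced field list are hypothetical stand-ins, not the actual Feast models touched by this PR.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict


class SnowflakeConfigSketch(BaseModel):
    """Hypothetical stand-in for a Snowflake config model (not Feast's class)."""

    # Keep keys that are not declared as fields instead of rejecting them.
    model_config = ConfigDict(extra="allow")

    account: Optional[str] = None
    user: Optional[str] = None
    database: Optional[str] = None


config = SnowflakeConfigSketch(
    account="my_account",
    user="my_user",
    database="my_db",
    host="snowflake.localhost.localstack.cloud",  # not a declared field
)

# Extra keys are dumped alongside the declared fields, so they can be
# forwarded to the connector as keyword arguments.
connect_kwargs = config.model_dump(exclude_none=True)
print(connect_kwargs)
```

With `extra="forbid"` instead, the same constructor call would raise a validation error for `host`.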
Which issue(s) this PR fixes:
Misc
There are a lot of configuration options (see here). I think trying to maintain parity for dozens of options would be a difficult endeavour. Users could easily refer to the Snowflake connector documentation for any parameters considered non-standard.
Could we distinguish between parameters that are in regular use (which we can add explicitly) and those that are rarely used (which come in as extras)? That way we keep some control over the important parameters.
I think the parameters already explicitly listed in the Snowflake models are considered the "standard" options. Allowing extra parameters increases the flexibility for advanced use cases. Maybe we could mention this in the docs?
Another option would be to add an explicit "connection extras" (not sure on name) parameter that we unpack in the GetSnowflakeConnection class.
@michaelneely Do you have a quick code example of how it would look?
Under the proposed approach of allowing extras, if I wanted to add the host parameter to my Snowflake connection, then my feature_store.yaml would look like this:
registry:
  registry_type: snowflake.registry
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  host: snowflake.localhost.localstack.cloud
provider: local
offline_store:
  type: snowflake.offline
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  host: snowflake.localhost.localstack.cloud
And it "just works" because the host argument is passed through to the Snowflake connector's connect() method.
Account, user, password, role, warehouse, database, schema are all currently documented parameters for the registry and offline store.
@michaelneely Maybe I did not explicitly mention the connection extras. Sorry!
In my comment above I was asking for an example of the connection extras method of passing extras. I think the example you shared uses the current extras method.
Gotcha. I think it would be exactly like above, except host would sit under connection_extras, e.g.:
registry:
  registry_type: snowflake.registry
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  connection_extras:
    host: snowflake.localhost.localstack.cloud
And I would just update the pydantic model to include connection_extras: Dict[str, Any] and add some code to unpack that dictionary into the snowflake connector's connect() method.
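For comparison, the connection_extras alternative might look roughly like the sketch below, assuming Pydantic v2. `RegistryConfigSketch`, `fake_connect`, and `open_connection` are hypothetical names for illustration only; `fake_connect` stands in for the connector's connect() so the snippet runs without a Snowflake account.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field


class RegistryConfigSketch(BaseModel):
    """Hypothetical config model with an explicit bag of extra parameters."""

    account: Optional[str] = None
    user: Optional[str] = None
    connection_extras: Dict[str, Any] = Field(default_factory=dict)


def fake_connect(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for the connector's connect(); returns the kwargs it received."""
    return kwargs


def open_connection(config: RegistryConfigSketch) -> Dict[str, Any]:
    # Start from the explicitly modelled parameters...
    kwargs = config.model_dump(exclude={"connection_extras"}, exclude_none=True)
    # ...then unpack the extras on top (an extra overrides a same-named field).
    kwargs.update(config.connection_extras)
    return fake_connect(**kwargs)


received = open_connection(
    RegistryConfigSketch(
        account="my_account",
        user="my_user",
        connection_extras={"host": "snowflake.localhost.localstack.cloud"},
    )
)
print(received)
```

The unpacking step in `open_connection` is the additional code that this approach would require.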
The reason I'm not a big fan of this approach is that all these parameters (except registry_type) are passed into the Snowflake connector's connect() method anyway, so you're writing more code that needs to be maintained for very little gain.
Agreed! It seems we are heading down a more complex path. Let's keep the current approach instead 👍
Thanks, @jyejare, how can I get this merged?