
feat: Allow extra fields in Snowflake connections

Open michaelneely opened this issue 7 months ago • 9 comments

What this PR does / why we need it:

There are many additional configuration options that can be specified when creating a Snowflake connection, for example proxies and the session keep-alive heartbeat interval.

One concrete use case I've found already is setting the host name when testing Snowflake locally with LocalStack.

This PR allows setting additional configuration options in the feature_store.yaml file by changing the Snowflake Pydantic models to allow extra fields.
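As a rough illustration of what "allowing extra fields" means at the Pydantic level, here is a minimal sketch assuming Pydantic v2. `SnowflakeConfigSketch` and its fields are hypothetical stand-ins, not the actual Feast models, and the account and user values are placeholders.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict


class SnowflakeConfigSketch(BaseModel):
    """Hypothetical stand-in for a Feast Snowflake config model."""

    # Accept keys that are not declared below instead of rejecting them.
    model_config = ConfigDict(extra="allow")

    account: Optional[str] = None
    user: Optional[str] = None
    warehouse: Optional[str] = None


config = SnowflakeConfigSketch(
    account="my_account",
    user="my_user",
    host="snowflake.localhost.localstack.cloud",  # not a declared field
)

# Undeclared keys are kept and exposed separately from the declared fields.
extras = config.model_extra
# model_dump() returns declared fields and extras together.
dumped = config.model_dump(exclude_none=True)
```

With `extra="allow"`, the undeclared `host` key is kept on the model rather than dropped or rejected, so it stays available to whatever code builds the connection.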

Which issue(s) this PR fixes:

Misc

michaelneely avatar Jun 04 '25 09:06 michaelneely

There are a lot of configuration options (see here). I think trying to maintain parity for dozens of options would be a difficult endeavour. Users could easily refer to the Snowflake connector documentation for any parameters considered non-standard.

michaelneely avatar Jun 04 '25 11:06 michaelneely

Could we distinguish between the parameters that are in regular use (which we can add explicitly) and those that are rarely used (which come in as extras)? That way we keep some control over the important params.

jyejare avatar Jun 04 '25 11:06 jyejare

I think the parameters already explicitly listed in the Snowflake models are considered the "standard" options. Allowing extra parameters increases the flexibility for advanced use cases. Maybe we could mention this in the docs?

michaelneely avatar Jun 04 '25 11:06 michaelneely

Another option would be to add an explicit "connection extras" (not sure on name) parameter that we unpack in the GetSnowflakeConnection class.

michaelneely avatar Jun 04 '25 12:06 michaelneely

@michaelneely Do you have a quick code example of what it would look like?

jyejare avatar Jun 11 '25 11:06 jyejare

Under the proposed approach of allowing extras, if I wanted to add the host parameter to my Snowflake connection, then my feature_store.yaml would look like this:

registry:
  registry_type: snowflake.registry
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  host: snowflake.localhost.localstack.cloud
provider: local
offline_store:
  type: snowflake.offline
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  host: snowflake.localhost.localstack.cloud

And it "just works" because the host argument is passed through to the Snowflake connector's connect() method.

Account, user, password, role, warehouse, database, schema are all currently documented parameters for the registry and offline store.
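To make the pass-through concrete, here is a small sketch using only the standard library. `fake_connect` is a hypothetical stand-in for the connector's connect() that simply returns the keyword arguments it receives, and `build_connect_kwargs` is an illustrative helper, not Feast code. Which keys are Feast-only is an assumption here (`registry_type` and `type`).

```python
from typing import Any, Dict


def fake_connect(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for the connector's connect(); returns what it was given."""
    return kwargs


def build_connect_kwargs(section: Dict[str, Any]) -> Dict[str, Any]:
    """Drop the (assumed) Feast-only keys and keep everything else for the connector."""
    feast_only = {"registry_type", "type"}
    return {k: v for k, v in section.items() if k not in feast_only}


# Dict equivalent of the registry section above, with placeholder values.
registry_section = {
    "registry_type": "snowflake.registry",
    "account": "my_account",
    "user": "my_user",
    "host": "snowflake.localhost.localstack.cloud",
}

received = fake_connect(**build_connect_kwargs(registry_section))
```

Because everything that is not Feast-specific is forwarded as a keyword argument, an extra key such as host reaches the connection call without any dedicated handling.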

michaelneely avatar Jun 11 '25 12:06 michaelneely

@michaelneely Sorry, I should have been more explicit that I was asking about the connection extras option.

In my comment above I was looking for an example of passing extras via connection extras. I think the example you shared uses the current extras approach.

jyejare avatar Jun 12 '25 10:06 jyejare

Gotcha. I think it would be exactly like the above, except host would sit under connection_extras, e.g.:

registry:
  registry_type: snowflake.registry
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  connection_extras:
    host: snowflake.localhost.localstack.cloud

And I would just update the Pydantic model to include connection_extras: Dict[str, Any] and add some code to unpack that dictionary into the Snowflake connector's connect() method.

The reason I'm not really a big fan of this approach is that all these parameters (except registry_type) are passed into the snowflake connector's connect() method anyway, so you're writing more code that needs to be maintained for very little gain.
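For comparison, the connection_extras variant could look roughly like the sketch below. It uses a plain dataclass rather than the real Pydantic model to stay dependency-free. `RegistryConfigSketch` and `to_connect_kwargs` are hypothetical names, and letting extras override declared fields on a key clash is an assumption, not a decided behaviour.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class RegistryConfigSketch:
    """Hypothetical config with an explicit bucket for extra connector options."""

    account: str
    user: str
    connection_extras: Dict[str, Any] = field(default_factory=dict)


def to_connect_kwargs(cfg: RegistryConfigSketch) -> Dict[str, Any]:
    """Merge declared fields with connection_extras into one kwargs dict."""
    kwargs: Dict[str, Any] = {"account": cfg.account, "user": cfg.user}
    # Extras are applied last, so on a key clash they override declared fields.
    kwargs.update(cfg.connection_extras)
    return kwargs


cfg = RegistryConfigSketch(
    account="my_account",
    user="my_user",
    connection_extras={"host": "snowflake.localhost.localstack.cloud"},
)
connect_kwargs = to_connect_kwargs(cfg)
```

The merge step in `to_connect_kwargs` is the additional code this variant would need, and it ends up producing the same keyword arguments as the flat layout.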

michaelneely avatar Jun 12 '25 11:06 michaelneely

Agreed! It seems we are heading down a more complex path. Let's rather keep this approach 👍

jyejare avatar Jun 13 '25 14:06 jyejare

Thanks, @jyejare, how can I get this merged?

michaelneely avatar Jul 09 '25 10:07 michaelneely