Complex / nested types in Feast
Is your feature request related to a problem? Please describe.
There are several friction points in getting data into Feast:
- Upstream jobs (e.g. ETL jobs) may leave data in a rawer form, such as a nested format.
- Geospatial data can be very high volume and needs spatial indexes and range queries to generate features efficiently.
- Categorical features can be non-trivial to manage. E.g. if you have a category id, you technically also need some versioning of the range of ids (i.e. a vocab size) available at a point in time to make sense of it and convert it into e.g. one-hot / multi-hot embeddings usable by models.
- Because Feast lacks batch transformations, it may be preferable to store rawer data and have downstream clients apply on demand transformations, rather than encode and materialize precomputed features.
Describe the solution you'd like
There are several kinds of complex types that would be interesting to store in Feast:
- [ ] Nested or semi-structured types (e.g. Struct types in BigQuery, Object types in Snowflake)
- [ ] JSON types
- [ ] Improved embedding support (e.g. including conversions between sparse / dense representations so that you could input a category id (or list of category ids) and get out one-hot / multi-hot embeddings)
- [ ] Geospatial types (e.g. for constructing spatial indexes in the online store)
On demand feature views can then access these naturally once they are natively supported.
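To make the embedding item above concrete, here is a minimal sketch of the sparse / dense conversion it describes, in plain Python. The function names (`multi_hot`, `one_hot`, `to_sparse`) are hypothetical and are not part of Feast's API; the sketch also assumes category ids are zero-based integers and that the vocab size valid at the relevant point in time is passed in explicitly.

```python
from typing import Iterable, List


def multi_hot(category_ids: Iterable[int], vocab_size: int) -> List[int]:
    """Encode category ids as a dense 0/1 vector of length vocab_size.

    Hypothetical helper, not a Feast API. vocab_size stands in for the
    versioned id range that was valid when the ids were recorded.
    """
    vec = [0] * vocab_size
    for cid in category_ids:
        # Reject ids that fall outside the vocab version being used.
        if not 0 <= cid < vocab_size:
            raise ValueError(
                f"category id {cid} is outside vocab of size {vocab_size}"
            )
        vec[cid] = 1
    return vec


def one_hot(category_id: int, vocab_size: int) -> List[int]:
    """One-hot is the single-id case of multi-hot."""
    return multi_hot([category_id], vocab_size)


def to_sparse(dense: List[int]) -> List[int]:
    """Convert a dense 0/1 vector back to the sorted list of active ids."""
    return [i for i, v in enumerate(dense) if v]
```

For example, `multi_hot([0, 2], vocab_size=4)` gives `[1, 0, 1, 0]`, and `to_sparse` recovers `[0, 2]` from it. The out-of-range check is where the vocab versioning matters: an id recorded under a newer, larger vocab cannot be encoded against an older vocab size.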
Describe alternatives you've considered
- More robust batch feature engineering, but this forces materializing transformations ahead of time. Today it also forces users to write them in SQL (unless they use UDFs for a given DWH, a Spark-based offline store, etc.).
I'll be following this and will come back to post when I have some observations to share.
For the moment, my target use case involves feast-postgres and the built-in Postgres JSON and JSONB column types.
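As a rough illustration of what a downstream transformation over such a column might look like, the sketch below flattens a JSON document (as it would arrive if the column were read out as a string) into flat, feature-like keys. It uses only the standard library; `flatten_json` and the `__` separator are hypothetical choices for illustration, not anything Feast or feast-postgres provides.

```python
import json
from typing import Any, Dict


def flatten_json(raw: str, sep: str = "__") -> Dict[str, Any]:
    """Flatten nested JSON objects into a single-level dict.

    Hypothetical helper, not a Feast API. Nested object keys are joined
    with `sep`; lists and scalars are kept as leaf values unchanged.
    """
    out: Dict[str, Any] = {}

    def walk(value: Any, prefix: str) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{prefix}{sep}{key}" if prefix else key)
        else:
            # Leaf value: scalar, list, or null.
            out[prefix] = value

    walk(json.loads(raw), "")
    return out
```

For instance, `flatten_json('{"user": {"age": 31}, "tags": ["a", "b"]}')` yields `{"user__age": 31, "tags": ["a", "b"]}`. Lists are deliberately left intact here, since whether to explode them depends on the feature being built.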
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
bump
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
bump