Azure MSSQL backend (and possibly others): SQLAlchemy `Engine` objects break when passed to `pandas.read_sql()`

Open peter-resnick opened this issue 10 months ago • 2 comments

Expected Behavior

Feast CLI commands such as `feast plan` work when using an mssql offline store.

Current Behavior

Currently, repos with an mssql offline store break when running Feast CLI commands. The following traceback occurs:

  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/feast/repo_operations.py", line 218, in plan
    data_source.validate(store.config)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/feast/infra/offline_stores/contrib/mssql_offline_store/mssqlserver_source.py", line 215, in validate
    self.get_table_column_names_and_types(config)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/feast/infra/offline_stores/contrib/mssql_offline_store/mssqlserver_source.py", line 243, in get_table_column_names_and_types
    table_schema = pandas.read_sql(columns_query, conn)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 706, in read_sql
    return pandas_sql.read_query(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 2736, in read_query
    cursor = self.execute(sql, params)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/sql.py", line 2670, in execute
    cur = self.con.cursor()
AttributeError: 'Engine' object has no attribute 'cursor'
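
The last frame of the traceback shows pandas calling `self.con.cursor()`, that is, treating the object it was given as a DBAPI connection. The following is a minimal, standard-library-only sketch of that failure pattern. It does not use pandas or SQLAlchemy: `EngineLike` and `run_query` are hypothetical stand-ins made up for illustration, and the sketch makes no claim about why pandas chose this code path.

```python
import sqlite3


class EngineLike:
    """Hypothetical stand-in for an object that has no DBAPI `cursor()` method."""

    def connect(self):
        # Returns a real DBAPI connection, purely for illustration.
        return sqlite3.connect(":memory:")


def run_query(con, sql):
    """Mimic the call pattern in the last traceback frame: `con.cursor()` first."""
    cur = con.cursor()
    cur.execute(sql)
    return cur.fetchall()


# A DBAPI connection supports this call pattern.
dbapi_con = sqlite3.connect(":memory:")
rows = run_query(dbapi_con, "SELECT 1")

# An engine-like object does not, and fails with the same kind of AttributeError.
try:
    run_query(EngineLike(), "SELECT 1")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(rows)
print(error_message)
```

Under this reading, the error means the `Engine` was handed to code that expects something with a `cursor()` method, rather than the `Engine` itself being misconfigured.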

Steps to reproduce

With a `feature_store.yaml` of:

project: myproject
provider: azure
registry:
  registry_type: sql
  path: ${SQL_REGISTRY_CONNECTION_STRING}
offline_store: 
  type: mssql
  connection_string: ${SQL_OFFLINE_CONNECTION_STRING}
online_store:
  type: redis
  redis_type: redis_cluster
  connection_string: ${REDIS_CONNECTION_STRING}
entity_key_serialization_version: 2

When we run `feast plan`, we get the above error.

This may impact other providers, but I've only confirmed it with the azure provider.

Specifications

  • Version: 0.35.0
  • Platform: Mac + Linux (probably others)
  • Subsystem:
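
Since the traceback passes through both pandas and SQLAlchemy, it may help to record the installed versions of the packages involved alongside the Feast version. A small sketch using only the standard library follows; the default package list is an assumption based on the traceback and the `AZURE_REQUIRED` list, and `installed_versions` is a helper name made up here.

```python
from importlib import metadata


def installed_versions(packages=("feast", "pandas", "SQLAlchemy", "pyodbc", "pymssql")):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in the same environment that produces the error gives a version list that can be pasted into the issue.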

Possible Solution

I think this is really a SQLAlchemy versioning issue. I resolved the problem by pinning `SQLAlchemy<1.4.52`.

If we just update `AZURE_REQUIRED` to:

AZURE_REQUIRED = [
    "azure-storage-blob>=0.37.0",
    "azure-identity>=1.6.1",
    "SQLAlchemy>=1.4.19,<1.4.52",
    "pyodbc>=4.0.30",
    "pymssql",
] 

I think we'd avoid this issue.

peter-resnick, Apr 02 '24 18:04

I can make a PR for this when I get a chance.

peter-resnick, Apr 02 '24 18:04

@peter-resnick mssql is still not part of CI, so unfortunately these sorts of bugs may occur. I don't think pinning SQLAlchemy is an option, though: we recently merged an upgrade to 2.x (#4065).

tokoko, Apr 04 '24 05:04