databricks-sql-python
Consider making pandas an optional dependency
Description
Currently pandas is a hard requirement for the library, even though it is not always used.
Moving the pandas import in src/databricks/sql/client.py from the top of the file into the _convert_arrow_table method, after the if self.connection.disable_pandas is True: condition, would make it possible to turn pandas into an optional dependency.
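To illustrate the idea, here is a minimal sketch of a deferred import. It is not the library's actual code: FakeConnection, FakeResultSet, and the dict-of-lists input are hypothetical stand-ins (the real method works on an Arrow table), and the error message is my own assumption. The only point is that the pandas import sits after the disable_pandas check, so it never runs on that path.

```python
class FakeConnection:
    """Hypothetical stand-in for the connection; only carries the flag."""

    def __init__(self, disable_pandas):
        self.disable_pandas = disable_pandas


class FakeResultSet:
    """Hypothetical stand-in for the class that owns _convert_arrow_table."""

    def __init__(self, connection):
        self.connection = connection

    def _convert_arrow_table(self, columns):
        # `columns` maps column name -> list of values; it stands in for
        # the Arrow table the real method receives (assumption).
        if self.connection.disable_pandas is True:
            # Pure-Python path: pandas is never imported here.
            names = list(columns)
            return [dict(zip(names, row)) for row in zip(*columns.values())]

        # Deferred import: only reached when the pandas path is requested.
        try:
            import pandas
        except ImportError as exc:
            raise ImportError(
                "pandas is not installed; install it or set disable_pandas=True"
            ) from exc
        return pandas.DataFrame(columns).to_dict(orient="records")


result_set = FakeResultSet(FakeConnection(disable_pandas=True))
rows = result_set._convert_arrow_table({"a": [1, 2], "b": ["x", "y"]})
```

With disable_pandas set to True, the call above succeeds even in an environment where pandas is not installed, which is the behaviour this issue is asking for.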
With this change all the tests passed. I also uninstalled pandas and then ran the following:
import polars as pl
from databricks.sql.client import Connection
credentials = ...
query = ...
conn = Connection(
    server_hostname=credentials["server_hostname"],
    http_path=credentials["http_path"],
    access_token=credentials["access_token"],
)
dframe = pl.read_database(query, conn)
This ran without any issue.
Adding to this: the size of pandas makes it really hard to run this library in environments such as AWS Lambda, where the total deployment package size is limited.