pyprql
Make packages (pandas) optional
I would like to make all dependencies except prqlc optional, so that users can install only what they need, e.g. pyprql[pandas] or pyprql[jupyter].
In many cases pandas is no longer needed, especially now that #373 adds polars as a dependency.
One problem with using optional packages in pip is that they're quite rarely used, unlike in Rust. So IME most users aren't familiar with how to install them, or that they should check which set of dependencies they want.
We do have prqlc itself if folks only want the compiler.
If there is a large enough use case for splitting the dependencies here, then OK, but otherwise I would lean away from it. Maybe pandas is that? But also most installations would have pandas anyway...
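One way to soften the discoverability problem is to fail with an install hint when an optional dependency is missing. The following is only a sketch: `require_optional` is a hypothetical helper, not something pyprql provides, and the extra name `pandas` is taken from the `pyprql[pandas]` proposal above.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper for illustration; `extra` is the name of the
    optional-dependency group the user would need to install.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message that tells the user which extra to install.
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f"Install it with: pip install 'pyprql[{extra}]'"
        ) from exc
```

A feature that needs pandas would then call `pd = require_optional("pandas", "pandas")` at the point of use, so users who never touch that feature are not affected.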
great_tables recently removed both pandas and polars from its required dependencies (technically, it removed pandas, which was once a required dependency, to support polars-only installation). In the same way, polars users may not want to install pandas.
Currently pyprql pulls in pandas and duckdb, which are completely unnecessary for users who only want to use polars.
As the discussion about making pyarrow a required dependency of pandas (pandas-dev/pandas#57073) seems to show, I think packages with huge binaries tend to be shunned.
Certainly jupysql and duckdb are worth keeping for now, but pandas is really unnecessary.
Users who need pandas should already have it installed and prql.pandas_accessor can only be used if an instance of pandas.DataFrame is created using pandas in the first place.
> Users who need pandas should already have it installed and prql.pandas_accessor can only be used if an instance of pandas.DataFrame is created using pandas in the first place.
This is a very good point! Because this augments pandas' functionality but doesn't otherwise require pandas to work, we could even remove it from the dependencies altogether. Then if someone has pandas / polars installed, this library augments that; otherwise it doesn't interfere by installing anything.
(We'd still have them in dev dependencies so tests can run etc. And OK if you prefer to have them as optional dependencies)
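The "augment only what is already installed" idea could look roughly like this. It is a sketch under assumptions, not pyprql's actual code: the function names, the accessor name `prql_sketch`, and the `n_rows` method are all made up for illustration. The only library calls used are `importlib.util.find_spec`, `pandas.api.extensions.register_dataframe_accessor`, and `polars.api.register_dataframe_namespace`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def register_available_accessors() -> list:
    """Register accessors only for the DataFrame libraries already installed."""
    registered = []

    if is_installed("pandas"):
        import pandas as pd

        # "prql_sketch" is a placeholder accessor name for this example.
        @pd.api.extensions.register_dataframe_accessor("prql_sketch")
        class _PandasSketchAccessor:
            def __init__(self, df):
                self._df = df

            def n_rows(self) -> int:
                return len(self._df)

        registered.append("pandas")

    if is_installed("polars"):
        import polars as pl

        @pl.api.register_dataframe_namespace("prql_sketch")
        class _PolarsSketchNamespace:
            def __init__(self, df):
                self._df = df

            def n_rows(self) -> int:
                return self._df.height

        registered.append("polars")

    # With neither library installed, nothing is imported or registered.
    return registered
```

With this shape, neither pandas nor polars has to appear in the required dependencies; both can stay in the dev dependencies so the tests still exercise each branch.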