Phillip Cloud comments

Results: 993 comments of Phillip Cloud

feat: avoid double fetches on `ibis.duckdb.connect().read_csv("https://slow_url").cache()`

Can you break down the timing of: `CREATE VIEW v AS FROM read_csv('slow_url')` and `CREATE TABLE t AS FROM read_csv('slow_url')` ? It's not clear that there's a "double fetch" (as...

feat: avoid double fetches on `ibis.duckdb.connect().read_csv("https://slow_url").cache()`

I'm sympathetic to the problem here, but can we perhaps try to get this performance difference fixed upstream in DuckDB so that we don't have to alter the Ibis API?

feat(pyspark): add mode computation

You can add `ops.Mode: "mode"` to the `SIMPLE_OPS` mapping in the PySpark compiler, then run the test suite with `pytest -m pyspark -k mode` and remove the `"pyspark"` string from...

feat: Support for window functions in Polars backend

Yep, this is an issue. There's some work here: https://github.com/ibis-project/ibis/pull/10914, but I haven't had time to go through it fully yet.

feat: support unnest of struct with trino backend

Thanks for the issue. It seems like we're generating incorrect code (technically sqlglot is). It also seems like it's not possible to tell Trino not to flatten structs, which is...

feat: support unnest of struct with trino backend

> I am not sure if it helps, since the output would still be different from other backends: with trino you would have one column per struct field, while with...

feat: support unnest of struct with trino backend

The `sequence` function is also kneecapped at 10k elements, so there's really no free lunch here.

[HELP NEEDED] How to insert `RAND(seed)` in `ORDER BY` clause for impala backend?

There are a number of issues with making an API for this. See [here](https://github.com/ibis-project/ibis/issues/8054) for a discussion. In the meantime, for Impala, you should be able to do this: ```...

Add schema option to "read_xlsx"

Thanks for the issue! Do you have a file or some example data you could upload here that doesn't work as you'd like it to?

Create a method to return a spark df for pyspark backend

A short-term workaround would be to compile the SQL and pass that to the `SparkContext.sql` method: ```python expr = ... # an ibis expression spark_sql = expr.compile() spark_df =...