[Python] Add support for `df.show()` on `CREATE EXTERNAL TABLE` statements

Open · andygrove opened this issue 2 years ago · 1 comment

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

>>> df = ctx.sql("CREATE EXTERNAL TABLE orders STORED AS PARQUET LOCATION '/mnt/bigdata/tpch/sf1-parquet/orders'")
>>> df.show()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: Arrow error: External error: Execution error: Job hZh0A3x failed: Error planning job hZh0A3x: DataFusionError(Internal("Unsupported logical plan: CreateExternalTable"))

Describe the solution you'd like

Make this work

Describe alternatives you've considered

None

Additional context

None

andygrove, Aug 27 '22 17:08

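The `CREATE EXTERNAL TABLE` statement actually does work: the table is registered and can be queried. The error only appears when we try to show the results of the DataFrame returned by that statement. This is not good UX, so we will need to fix it.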

>>> import ballista
>>> ctx = ballista.BallistaContext("localhost", 50050)
>>> ctx.sql("CREATE EXTERNAL TABLE orders STORED AS PARQUET LOCATION '/mnt/bigdata/tpch/sf1-parquet/orders'")
<ballista.DataFrame object at 0x7f501f088150>
>>> df = ctx.sql("SELECT count(*) FROM orders")
>>> df.show()
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 14999999        |
+-----------------+
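
Until `show()` is supported for this plan type, one possible client-side workaround is to skip `show()` for DDL statements. The following is only a minimal sketch: `is_ddl` and `run_sql` are hypothetical helpers, not part of the Ballista API, and the keyword check is a rough heuristic that assumes only `CREATE` statements need to be skipped. The only calls it relies on are `ctx.sql(...)` and `df.show()`, as used in the session above.

```python
# Hypothetical helpers for illustration; not part of the ballista package.

# Assumption: only CREATE statements trigger the "Unsupported logical plan" error.
DDL_KEYWORDS = {"CREATE"}


def is_ddl(sql: str) -> bool:
    """Return True if the first keyword of the statement looks like DDL."""
    tokens = sql.split(None, 1)  # split on any whitespace, ignoring leading blanks
    return bool(tokens) and tokens[0].upper() in DDL_KEYWORDS


def run_sql(ctx, sql: str):
    """Run a statement through ctx.sql and show results only for non-DDL."""
    df = ctx.sql(sql)
    if not is_ddl(sql):
        df.show()
    return df
```

With a context created as in the session above, `run_sql(ctx, "CREATE EXTERNAL TABLE ...")` would register the table without calling `show()`, while `run_sql(ctx, "SELECT count(*) FROM orders")` would print the result as before.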

andygrove, Aug 27 '22 18:08