datafusion-ballista
[Python] Add support for `df.show()` on `CREATE EXTERNAL TABLE` statements
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
>>> df = ctx.sql("CREATE EXTERNAL TABLE orders STORED AS PARQUET LOCATION '/mnt/bigdata/tpch/sf1-parquet/orders'")
>>> df.show()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: Arrow error: External error: Execution error: Job hZh0A3x failed: Error planning job hZh0A3x: DataFusionError(Internal("Unsupported logical plan: CreateExternalTable"))
Describe the solution you'd like: Make this work.
Describe alternatives you've considered: None.
Additional context: None.
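The `CREATE EXTERNAL TABLE` statement itself does work: the table is registered and can be queried afterwards. The error only occurs when we call `show()` on the DataFrame that the statement returns. This is poor UX, so we need to fix it.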
>>> import ballista
>>> ctx = ballista.BallistaContext("localhost", 50050)
>>> ctx.sql("CREATE EXTERNAL TABLE orders STORED AS PARQUET LOCATION '/mnt/bigdata/tpch/sf1-parquet/orders'")
<ballista.DataFrame object at 0x7f501f088150>
>>> df = ctx.sql("SELECT count(*) FROM orders")
>>> df.show()
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 14999999        |
+-----------------+
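Until `df.show()` handles this case, one possible client-side workaround is to skip `show()` for DDL statements. The sketch below is only an illustration under assumptions: `is_ddl` and `run_sql` are hypothetical helpers, not part of the `ballista` package, and the prefix check only covers statements starting with `CREATE` or `DROP`. It relies solely on `ctx.sql(...)` and `df.show()` as used in the sessions above.

```python
import re

# Hypothetical helpers for illustration; not part of the ballista package.
# Assumption: a statement starting with CREATE or DROP is DDL and yields
# a DataFrame that cannot be shown.
_DDL_PREFIX = re.compile(r"^\s*(CREATE|DROP)\b", re.IGNORECASE)


def is_ddl(sql: str) -> bool:
    """Return True if the statement looks like DDL (CREATE ... / DROP ...)."""
    return _DDL_PREFIX.match(sql) is not None


def run_sql(ctx, sql: str):
    """Run a statement via ctx.sql and show results only for non-DDL statements."""
    df = ctx.sql(sql)
    if not is_ddl(sql):
        df.show()
    return df
```

With a context created as in the session above, `run_sql(ctx, "CREATE EXTERNAL TABLE ...")` would register the table without calling `show()`, while `run_sql(ctx, "SELECT count(*) FROM orders")` would print the result as before. This only sidesteps the error on the client; it does not replace a fix in the planner.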