ibis
bug: [Athena] Missing support for CREATE TABLE AS SELECT (CTAS) or expression materialization in Athena backend
What happened?
Summary
The Athena backend in Ibis currently lacks support for materializing Ibis expressions into Athena tables via CREATE TABLE AS SELECT (CTAS), which makes it incompatible with downstream tools like Kedro that rely on Ibis’s Table.save() API.
Context
While create_table() exists for the Athena backend, it appears to support only defining an empty external table from a schema and location; it does not appear to support saving expressions or CTAS operations.
This limits the backend's ability to support materialization workflows, which are common in ETL pipelines.
Use Case
In a Kedro pipeline, I'm trying to read from a source Athena table and save the transformed result into a new Athena table using Ibis and kedro_datasets.ibis.TableDataset.
Example:
python
from kedro_datasets.ibis import TableDataset

# src_table_name is the name of an existing Athena table (defined elsewhere).
load_dataset = TableDataset(
    table_name=src_table_name,
    connection={
        "backend": "athena",
        "s3_staging_dir": "s3://bucket-name/...1",
        "schema_name": "schema_name",
    },
)
data = load_dataset.load()

# Save the loaded expression as a new table in the same schema.
save_dataset = TableDataset(
    table_name="table_name",
    connection={
        "backend": "athena",
        "s3_staging_dir": "s3://bucket-name/...",
        "schema_name": "schema_name",
    },
    save_args={"materialized": "table", "overwrite": None},
)
save_dataset.save(data)
Problem:
This fails with: DatasetError: Failed while saving data to dataset TableDataset(...). Under the hood, Kedro + Ibis appears to call create_table(..., obj=<IbisExpr>) — but the Athena backend does not implement CTAS or insert operations, leading to failure after writing intermediate data to S3.
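To check whether the failure comes from Ibis or from the Kedro layer, the same materialization can be attempted directly against the backend. The following is a minimal sketch, not the code Kedro runs: the helper names `connect_athena` and `materialize` are hypothetical, the `ibis.athena.connect` keyword arguments are assumed to mirror the `connection` dict above, and `create_table(name, obj=..., overwrite=...)` is assumed to be the call that reaches the CTAS path.

```python
def connect_athena(s3_staging_dir, schema_name):
    """Open an Athena connection.

    Assumes ibis-framework is installed with Athena support and that AWS
    credentials are available in the environment.
    """
    import ibis  # imported lazily so materialize() can be used without it

    return ibis.athena.connect(
        s3_staging_dir=s3_staging_dir,
        schema_name=schema_name,
    )


def materialize(con, src_table_name, dst_table_name, overwrite=False):
    """Create dst_table_name from an expression over src_table_name.

    `con` is any connected Ibis backend. Passing the expression as `obj`
    asks the backend to build the table from the query result.
    """
    expr = con.table(src_table_name)
    return con.create_table(dst_table_name, obj=expr, overwrite=overwrite)
```

If calling `materialize(connect_athena(...), src_table_name, "table_name")` with the same staging directory and schema succeeds, the problem is more likely in the dataset configuration than in the backend; if it fails, the resulting traceback is the one worth attaching here.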
Expected Behavior
The Athena backend should ideally support:
- Materializing an Ibis expression as a table using CREATE TABLE AS SELECT (CTAS)
- Possibly also insert() or overwrite() operations
Why It Matters
This feature would enable:
- Full Kedro + Ibis pipelines on Athena
- Pipelines that no longer fall back to PyAthena or raw DDL/pyarrow + s3fs for intermediate steps
What version of ibis are you using?
ibis-framework 10.5.0, kedro-datasets 7.0.0
What backend(s) are you using, if any?
Athena
Relevant log output
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
@uday-dasari Do you have a traceback or something else indicating that CTAS doesn't work? We definitely support this use case, and many of our tests hit the code path for create_table with an existing Ibis expression.
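One way to get at the underlying traceback: the DatasetError message quoted above is a wrapper, and the original backend exception is usually still attached to it through Python's exception chaining. A minimal sketch using only the standard library; the `DatasetError` class and `failing_save` function here are stand-ins for illustration, not the Kedro or Ibis implementations, and the error text is a placeholder.

```python
import traceback


def root_cause(exc):
    """Follow __cause__/__context__ links to the innermost exception."""
    seen = {id(exc)}
    while True:
        nxt = exc.__cause__ or exc.__context__
        if nxt is None or id(nxt) in seen:
            return exc
        seen.add(id(nxt))
        exc = nxt


def format_exception_chain(exc):
    """Render the full traceback text, including chained exceptions."""
    return "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )


class DatasetError(Exception):
    """Stand-in for the wrapper exception raised by the dataset layer."""


def failing_save():
    """Simulate a save that wraps a backend error (placeholder message)."""
    try:
        raise RuntimeError("backend error (placeholder)")
    except Exception as exc:
        raise DatasetError("Failed while saving data to dataset") from exc
```

Wrapping the real `save_dataset.save(data)` call in `try`/`except Exception as exc` and printing `format_exception_chain(exc)` should show the backend error beneath the wrapper, which is the part that indicates whether the CTAS path was reached.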