Failure on creation of a new Presto dataset due to failure to load columns
We have a Presto database connection that works fine with virtual datasets (added through SQL Lab). When trying to add a new dataset from a physical table, we get an error.
Reproducing the bug:
- Go to Datasets
- Click on '+ Dataset'
- Select a Presto DB
- Select the Schema
- Select the Table
- See error
Expected results
See the table columns on the right pane
Actual results
An error message appears on the right pane: "An Error Occurred Unable to load columns for the selected table. Please select a different table."
Couldn't find any errors in the Superset app container log or any other container.
Screenshots
Environment
- Google Chrome, Version 119.0.6045.123
- Superset version: 3.0.1
- python version: 3.11.4
- node.js version: 20.9.0
- any feature flags active: none
Checklist
- [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version of superset.
- [x] I have checked the issue tracker for the same issue and I haven't found one similar.
Additional context
Possibly a duplicate of https://github.com/apache/superset/issues/23982, do you have that sort of datetime partition / can you try a table without that to isolate the issue? Specifically the comment https://github.com/apache/superset/issues/23982#issuecomment-1551430633 has a fix you could try.
@bkyryliuk you are listed as a Presto user in the database rolodex - do you have thoughts on this or that issue I linked?
@sfirke Thanks for your comment, but I don't think it's the same problem. It happens for all tables. The attached table has only "string" columns.
I was able to debug the problem and found that it is caused by the stringification of the data type in the _create_column_info function. The caller expects a column type object, not a string representation of it. Once I changed that, I was able to load the table columns and create the dataset.
Nice problem solving! Thanks for the update. Are you able to propose a fix to the codebase and send a pull request?
Thanks. Sure, I'll send the PR.
@eyalsh99 what did you change to fix the issue?
Go to superset/db_engine_specs/presto.py and, in the _create_column_info function, make the following change: "type": f"{data_type}", ==> "type": data_type,
The function is expected to return the native column type, not a stringified version of it.
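To illustrate why the stringified type matters, here is a minimal, self-contained sketch. It is not the Superset code: FakeVarchar, create_column_info_stringified and create_column_info_native are hypothetical stand-ins, and the dictionary keys other than "type" are assumptions. It only shows that a consumer expecting a type object loses its attributes once the value has been formatted into a string.

```python
from typing import Any


class FakeVarchar:
    """Hypothetical stand-in for a SQLAlchemy-style column type object."""

    python_type = str

    def __str__(self) -> str:
        return "VARCHAR"


def create_column_info_stringified(name: str, data_type: Any) -> dict[str, Any]:
    # Mirrors the behaviour described as the bug: the f-string turns the
    # type object into a plain str.
    return {"column_name": name, "type": f"{data_type}"}


def create_column_info_native(name: str, data_type: Any) -> dict[str, Any]:
    # Mirrors the behaviour described as the fix: the type object is
    # passed through unchanged.
    return {"column_name": name, "type": data_type}


broken = create_column_info_stringified("city", FakeVarchar())
fixed = create_column_info_native("city", FakeVarchar())

# The stringified variant no longer exposes type-object attributes.
print(type(broken["type"]).__name__, hasattr(broken["type"], "python_type"))
print(type(fixed["type"]).__name__, hasattr(fixed["type"], "python_type"))
```

With the stringified variant, any downstream code that inspects the type object (here, the assumed python_type attribute) has nothing to work with, which is consistent with the column list failing to load.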
@eyalsh99 are you still planning to open a PR for that change (and maybe add a test if we're lucky?)
@eyalsh99 I am facing the same problem using Athena, maybe you know a solution for my case?
Someone in Slack just reported experiencing this in Athena, too. Same issue or different?
@sfirke Same issue, but on Athena, not Presto. And athena.py doesn't have a _create_column_info function.
Apologies for the late response. Submitted the PR: #28305
Re-opening because the fix was reverted in https://github.com/apache/superset/pull/28613.