Failure on creation of a new Presto dataset due to failure to load columns

Open eyalsh99 opened this issue 2 years ago • 14 comments

We have a Presto database connection that works fine with virtual datasets (added through SQL Lab). When we try to add a new dataset from a physical table, we get an error.

Reproducing the bug:

  1. Go to Datasets
  2. Click on '+ Dataset'
  3. Select a Presto DB
  4. Select the Schema
  5. Select the Table
  6. See error

Expected results

See the table columns on the right pane

Actual results

An error message appears on the right pane: "An Error Occurred Unable to load columns for the selected table. Please select a different table."

I couldn't find any errors in the Superset app container log or in the log of any other container.

Screenshots

image

Environment

  • Google Chrome, Version 119.0.6045.123
  • Superset version: 3.0.1
  • python version: 3.11.4
  • node.js version: 20.9.0
  • any feature flags active: none

Checklist

  • [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [x] I have reproduced the issue with at least the latest released version of superset.
  • [x] I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

eyalsh99 avatar Nov 12 '23 15:11 eyalsh99

Possibly a duplicate of https://github.com/apache/superset/issues/23982. Does your table have that sort of datetime partition, and can you try a table without one to isolate the issue? Specifically, the comment https://github.com/apache/superset/issues/23982#issuecomment-1551430633 has a fix you could try.

sfirke avatar Nov 14 '23 17:11 sfirke

@bkyryliuk you are listed as a Presto user in the database rolodex - do you have thoughts on this or that issue I linked?

sfirke avatar Nov 14 '23 17:11 sfirke

@sfirke Thanks for your comment, but I don't think it's the same problem. It happens for all tables. The attached table has only "string" columns.

eyalsh99 avatar Nov 19 '23 17:11 eyalsh99

I was able to debug the problem and found that it is caused by the stringification of the data type in the _create_column_info function. The function is expected to return a column type object, not a string representation of it. Once I changed that, I was able to load the table columns and create the dataset.
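To illustrate why stringifying a type can break code that expects a type object, here is a minimal sketch. FakeVarchar is a hypothetical stand-in written for this example; it is not Superset or SQLAlchemy code, and the real type objects involved may behave differently.

```python
class FakeVarchar:
    """Hypothetical stand-in for a column type object (illustration only)."""

    def __init__(self, length=None):
        self.length = length

    def __str__(self):
        return "VARCHAR" if self.length is None else f"VARCHAR({self.length})"


data_type = FakeVarchar(255)

# Formatting the object through an f-string yields a plain str.
stringified = f"{data_type}"
# Passing the object through keeps its attributes available.
native = data_type

print(type(stringified).__name__)       # str
print(hasattr(stringified, "length"))   # False
print(native.length)                    # 255
```

Any consumer that inspects attributes of the type, or checks what kind of object it is, would see only a str in the first case.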

eyalsh99 avatar Nov 19 '23 17:11 eyalsh99

Nice problem solving! Thanks for the update. Are you able to propose a fix to the codebase and send a pull request?

sfirke avatar Nov 19 '23 18:11 sfirke

Thanks. Sure, I'll send the PR.

eyalsh99 avatar Nov 20 '23 15:11 eyalsh99

@eyalsh99 what did you change to fix the issue?

ameedbakri avatar Feb 05 '24 13:02 ameedbakri

In superset/db_engine_specs/presto.py, make the following change in the _create_column_info function: "type": f"{data_type}", ==> "type": data_type,

The function is expected to return the native column type, not a stringified version of it.
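The change described above can be sketched as a before/after pair. This is a simplified illustration, not the actual Superset function: the function names, the reduced two-key dictionary, and StubColumnType are all assumptions made for the example. Only the "type" entry mirrors the one-line change quoted in this comment.

```python
from typing import Any, Dict


class StubColumnType:
    """Hypothetical stand-in for a native column type object."""

    def __str__(self) -> str:
        return "VARCHAR"


def column_info_before(name: str, data_type: Any) -> Dict[str, Any]:
    # Behaviour described as the bug: the type is formatted into a string.
    return {"name": name, "type": f"{data_type}"}


def column_info_after(name: str, data_type: Any) -> Dict[str, Any]:
    # Behaviour after the proposed change: the type object is passed through.
    return {"name": name, "type": data_type}


varchar = StubColumnType()
before = column_info_before("city", varchar)
after = column_info_after("city", varchar)

# Regression-style checks: the old version returns a str,
# the new version returns the very same object it was given.
assert isinstance(before["type"], str)
assert after["type"] is varchar
```

A check of this shape (asserting that the returned type is not a str) could serve as a starting point for a test accompanying the change.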

eyalsh99 avatar Feb 07 '24 07:02 eyalsh99

@eyalsh99 are you still planning to open a PR for that change (and maybe add a test if we're lucky?)

rusackas avatar Mar 01 '24 17:03 rusackas

@eyalsh99 I am facing the same problem with Athena. Do you know of a solution for my case?

RyzhkovIlia avatar Apr 19 '24 15:04 RyzhkovIlia

Someone in Slack just reported experiencing this in Athena, too. Same issue or different?

sfirke avatar Apr 26 '24 15:04 sfirke

@sfirke Same issue, but on Athena, not Presto. And athena.py doesn't have a _create_column_info function.

RyzhkovIlia avatar Apr 29 '24 06:04 RyzhkovIlia

Apologies for the late response. Submitted the PR: #28305

eyalsh99 avatar May 01 '24 08:05 eyalsh99

Re-opening because the fix was reverted in https://github.com/apache/superset/pull/28613.

john-bodley avatar May 21 '24 18:05 john-bodley