"SEE TABLE SCHEMA" queries to trino remain "FINISHING" when selecting iceberg tables

Open maxgruber19 opened this issue 3 months ago • 9 comments

Bug description

We found a very ugly issue when connecting Superset (version 4.0.2) to Trino (476). A user reported that Superset is very slow and that sometimes no queries are processed at all. We found out that Superset does not finish its queries when somebody selects a table (Iceberg via Trino) in the "SEE TABLE SCHEMA" dropdown. Combined with our resource pools (users can submit only 5 queries at a time; the 6th is queued), that's definitely a customer-facing problem for us, so I'm looking for help.

Superset fires 3 queries to Trino, and all of them are identical. That's something I don't understand either: why the same query 3 times?

SELECT * FROM default."tablename$partitions"

Superset already gets a result after a couple of seconds and displays the table schema, but the query stays in state "FINISHING" until a timeout of ~5 minutes is hit (the default query.client.timeout in Trino). Trino then abandons the query itself.

io.trino.spi.TrinoException: Query 20250828_082847_00409_ksycd was abandoned by the client, as it may have exited or stopped checking for query results. Query results have not been accessed since 2025-08-28T08:28:49.874Z: currentTime 2025-08-28T08:33:50.786Z
	at io.trino.execution.QueryTracker.failAbandonedQueries(QueryTracker.java:275)
	at io.trino.execution.QueryTracker.lambda$start$0(QueryTracker.java:83)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
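
For reference, the two timestamps in the exception can be checked against the ~5 minute timeout, and the effect on the resource pool can be estimated. This is a minimal sketch: the timestamps are copied from the trace above, while the constants for "3 queries per table selection" and "5 concurrent queries per user" are taken from the description and assume that all three queries linger for the full timeout.

```python
from datetime import datetime


def parse_trino_ts(ts: str) -> datetime:
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


# Timestamps copied from the TrinoException message.
last_access = parse_trino_ts("2025-08-28T08:28:49.874Z")
abandoned_at = parse_trino_ts("2025-08-28T08:33:50.786Z")
idle_seconds = (abandoned_at - last_access).total_seconds()

# Assumptions taken from the issue description, not measured values.
QUERIES_PER_SELECTION = 3   # identical $partitions queries per table selection
CONCURRENT_QUERY_LIMIT = 5  # queries a user may run before queueing starts

# Smallest number of table selections (within one timeout window) whose
# lingering queries exceed the limit, i.e. smallest n with 3 * n > 5.
selections_until_queueing = CONCURRENT_QUERY_LIMIT // QUERIES_PER_SELECTION + 1

print(f"idle before abandonment: {idle_seconds:.3f} s")
print(f"selections until queueing: {selections_until_queueing}")
```

Under these assumptions the query sat idle for roughly 301 seconds, and the second table selection within that window is already enough to push a user over the limit.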

The tables have a large number of partitions; we haven't managed to optimize them yet. Some of the tables have ~10,000 partitions (there are ugly reasons for that). Maybe that's relevant, but I don't think so, because Superset displays the schema promptly; it feels like the queries simply never terminate. According to Trino, Superset fetches ~10,000 rows and 5 MB from those queries within seconds.

We already checked with cilium-hubble for outgoing traffic from Superset to Trino while the query is still listed as FINISHING, but it seems the query becomes a zombie right after Superset displays the table schema.

Please let me know if I can provide any further details

Screenshots/recordings

No response

Superset version

4.0.2

Python version

3.9

Node version

I don't know

Browser

Not applicable

Additional context

No response

Checklist

  • [x] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [x] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [x] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

maxgruber19 avatar Sep 02 '25 19:09 maxgruber19

Hi @maxgruber19! I'm Dosu and I’m helping the superset team.

This behavior happens because Superset, when introspecting Trino/Iceberg tables for schema info, issues partition metadata queries like SELECT * FROM ...$partitions as part of its engine-specific metadata fetching logic. The reason you see three identical queries is that Superset's schema introspection calls several metadata routines in sequence (such as fetching columns, indexes, and keys), each of which may trigger the same expensive partition query via SQLAlchemy reflection and the Trino engine spec. This is not unique to your setup and is a side effect of how Superset gathers table metadata for complex backends like Trino/Iceberg[^1].

The lingering "FINISHING" state in Trino is due to Superset fetching the results and then closing the connection without explicitly closing or cancelling the Trino query. Trino expects the client to either fetch all results or explicitly close the query; otherwise, it keeps the query alive until its timeout is reached. Superset relies on SQLAlchemy/DBAPI context managers for cleanup, which may not force a query kill on the Trino side, especially for large result sets or metadata queries[^1].

There is currently no explicit mechanism in Superset to close or cancel these metadata queries after schema introspection, and no recent changes in the codebase address this. The issue is more pronounced with large-partition Iceberg tables, as the partition metadata queries are expensive and can quickly exhaust Trino's resource pools if not cleaned up promptly[^1].

If this is causing operational issues, you might consider:

  • Reducing the number of partitions in your Iceberg tables if possible.
  • Increasing Trino's resource pool limits or lowering the query timeout.
  • Exploring if you can patch Superset's Trino engine spec to avoid redundant partition queries or to explicitly cancel them after fetching metadata.

[^1]: Superset TrinoEngineSpec and metadata query handling
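
To illustrate the "fetch everything or explicitly cancel" rule mentioned above, here is a minimal sketch of a client loop over Trino's HTTP statement protocol, where each response page may carry `data` and a `nextUri`, and a `DELETE` on `nextUri` cancels the query. The function name and the injected `http_get` / `http_delete` callables are hypothetical stand-ins for real HTTP calls; this is not Superset's or the Trino Python client's actual code.

```python
from typing import Any, Callable, Dict, List, Optional

Json = Dict[str, Any]


def fetch_rows_then_release(
    first_page: Json,
    http_get: Callable[[str], Json],
    http_delete: Callable[[str], None],
    max_rows: Optional[int] = None,
) -> List[list]:
    """Collect rows, then leave the query in a finished or cancelled state.

    first_page is the JSON body returned by POST /v1/statement.
    """
    rows: List[list] = []
    page = first_page
    while True:
        rows.extend(page.get("data", []))
        next_uri = page.get("nextUri")
        if next_uri is None:
            # No nextUri: the server considers the query fully consumed.
            return rows if max_rows is None else rows[:max_rows]
        if max_rows is not None and len(rows) >= max_rows:
            # We have enough rows; cancel instead of silently walking away,
            # so the query does not wait for the client timeout.
            http_delete(next_uri)
            return rows[:max_rows]
        page = http_get(next_uri)
```

The point of the sketch is the two exit paths: either the loop runs until no `nextUri` is returned, or it issues an explicit cancel. A client that stops polling without taking either path would leave the query waiting for the client timeout.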


dosubot[bot] avatar Sep 02 '25 19:09 dosubot[bot]

@maxgruber19 do you get the same issues with Superset 4.1.X or 5.0.X?

sadpandajoe avatar Sep 03 '25 17:09 sadpandajoe

@sadpandajoe I'll try upgrading the deployment and come back here again, thank you.

I was scanning the changelogs but didn't find anything related to my issue. Are there relevant changes you have in mind?

maxgruber19 avatar Sep 03 '25 19:09 maxgruber19

@sadpandajoe I upgraded to 4.1.2 and the issue remains unchanged. Unfortunately, I cannot upgrade to 5.0.X for internal reasons. Queries are still abandoned after query.client.timeout.

I was able to gather some further logs (see below) from the 4.1.2 deployment, covering the seconds after I select an Iceberg table in the "SEE TABLE SCHEMA" dropdown. The GET requests end immediately after the table information is displayed in the UI, which looks to me like Superset really just doesn't finish its query to Trino. It doesn't look like a performance issue to me, tbh.

I'm also still wondering why the same query is issued 3 times at once. Can you tell me more about that?

2025-09-04 13:13:18,783:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:18,788:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:18,820:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 389
2025-09-04 13:13:18,826:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:18,830:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:18,833:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131318_00830_dzkrm/y395bdfc547e3d2ae7e76e33fa7e17dcc2c0397bb/1 HTTP/1.1" 200 394
2025-09-04 13:13:18,855:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 389
2025-09-04 13:13:18,872:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131318_00831_dzkrm/y775c99499db2736f5ffabf35ebf25c52ab3e98db/1 HTTP/1.1" 200 388
2025-09-04 13:13:19,088:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131318_00830_dzkrm/yfcffcc510e3a9535dfc702b993fbb3f29605112a/2 HTTP/1.1" 200 403
2025-09-04 13:13:19,090:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131318_00831_dzkrm/y57cdffc735e8d5ed3a987b110afde8ade1e29aa6/2 HTTP/1.1" 200 397
2025-09-04 13:13:19,112:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131318_00831_dzkrm/y8edf65900f9344bd35a15bb050b4023971a24724/0 HTTP/1.1" 200 566
2025-09-04 13:13:19,114:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131318_00830_dzkrm/ybff9c9203eb0f31a9698f639261b17d197278fa2/0 HTTP/1.1" 200 569
2025-09-04 13:13:19,402:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131318_00830_dzkrm/y6a598db559601f2f7744d81336f0c50ff2232a8b/1 HTTP/1.1" 200 611
2025-09-04 13:13:19,407:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131318_00830_dzkrm/ya08d9cf536955649d0b48cf0775b892d2a21dc01/2 HTTP/1.1" 200 533
2025-09-04 13:13:19,411:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 390
2025-09-04 13:13:19,415:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131318_00831_dzkrm/y756507f948dceb95e69a9bcf471a9d4cc603e31a/1 HTTP/1.1" 200 607
2025-09-04 13:13:19,419:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131318_00831_dzkrm/yfd24f6ea21aa0f61cbb470ed8059d00bb06cf9b0/2 HTTP/1.1" 200 532
2025-09-04 13:13:19,420:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00832_dzkrm/yba920663c7e80348b97b546d3c6089515decd5cc/1 HTTP/1.1" 200 389
2025-09-04 13:13:19,422:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 389
2025-09-04 13:13:19,430:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00833_dzkrm/y3c62b21c9fb69cd5f7ea5bdbd5eba28287a4b627/1 HTTP/1.1" 200 389
2025-09-04 13:13:19,434:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00832_dzkrm/y7f2d62c2a4c6178c0e72457789074a2d7cedcb68/2 HTTP/1.1" 200 397
2025-09-04 13:13:19,461:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00832_dzkrm/ycbd31b0efea428f81cbef075b010f094f217b3ad/0 HTTP/1.1" 200 657
2025-09-04 13:13:19,598:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00833_dzkrm/ye6f11d1aff4031a22410d5c048d57b8eb3e84936/2 HTTP/1.1" 200 398
2025-09-04 13:13:19,606:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00833_dzkrm/y9c4976ee5c1b3c0a0833747442277acdb0290f7b/0 HTTP/1.1" 200 None
2025-09-04 13:13:19,623:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00832_dzkrm/yb0ea7146b3639bcad586c279afa9e26b2ad86b86/1 HTTP/1.1" 200 1018
2025-09-04 13:13:19,627:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00832_dzkrm/yad024e7d36e4d53ec42faee793d2925581e244ea/2 HTTP/1.1" 200 631
2025-09-04 13:13:19,632:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:19,635:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:19,638:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:19,642:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:19,666:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 390
2025-09-04 13:13:19,676:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00834_dzkrm/y1d840c17e7317c42567fbe63e80fc455c9dbfdb4/1 HTTP/1.1" 200 391
2025-09-04 13:13:19,684:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00834_dzkrm/y82fcb08b8c757a8bf110696172819b98ce936a3c/2 HTTP/1.1" 200 396
2025-09-04 13:13:19,696:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00834_dzkrm/y5569cdd0a55225e7cea702428173aa040df17a6e/0 HTTP/1.1" 200 563
2025-09-04 13:13:19,850:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00834_dzkrm/y2a650bbe9bd425147b9dba9139c3ba5309253b05/1 HTTP/1.1" 200 607
2025-09-04 13:13:19,855:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00834_dzkrm/y6927ae53c999fe145fa8f457df8640eb325aecf0/2 HTTP/1.1" 200 531
2025-09-04 13:13:19,859:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 389
2025-09-04 13:13:19,867:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00835_dzkrm/y42787a33343d60fde0e26af0209a883de90f98d4/1 HTTP/1.1" 200 389
2025-09-04 13:13:20,033:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131319_00835_dzkrm/y47fa5fb764ba2ee8e0464f7a122f54d1e824eef1/2 HTTP/1.1" 200 401
2025-09-04 13:13:20,041:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00835_dzkrm/yc0f48bf33db1f59ab7f6f49c5789f21be37bf661/0 HTTP/1.1" 200 None
2025-09-04 13:13:20,613:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00833_dzkrm/ydc4e7b307686065a44ca52e6b37bcf167ef57eb6/1 HTTP/1.1" 200 None
2025-09-04 13:13:21,048:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00835_dzkrm/y61bd79c96683e1871ae18de8121735284adfb449/1 HTTP/1.1" 200 None
2025-09-04 13:13:21,620:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00833_dzkrm/ya1bf470de54d7794d6272a215747592665b647f0/2 HTTP/1.1" 200 None
2025-09-04 13:13:22,054:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00835_dzkrm/y5b5c2c98b55107856b612d833db59eb20effb2d3/2 HTTP/1.1" 200 None
2025-09-04 13:13:22,626:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00833_dzkrm/y8af88e08341bd9b55adf1f57cc2525e4ef0e04c4/3 HTTP/1.1" 200 None
2025-09-04 13:13:22,700:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00833_dzkrm/y0ee69165612d046661ebb845cc430dc65e3d3a1d/4 HTTP/1.1" 200 None
2025-09-04 13:13:22,956:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:22,959:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:22,960:DEBUG:superset.sql_parse:Parsing with sqlparse statement: SELECT * FROM default."tablename$partitions"

ORDER BY partition DESC, record_count DESC, file_count DESC, total_size DESC, data DESC
LIMIT 1

2025-09-04 13:13:22,962:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:22,987:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 390
2025-09-04 13:13:22,997:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131322_00836_dzkrm/y5e4a47795a8a66d2e01671a95c7c68225d6251e6/1 HTTP/1.1" 200 390
2025-09-04 13:13:23,001:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131319_00835_dzkrm/yac2ffea463716f319973445c989e03f5364eb9bb/3 HTTP/1.1" 200 None
2025-09-04 13:13:23,432:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131322_00836_dzkrm/y686acb976cb0e51413314e6fd1c8103e2275e377/2 HTTP/1.1" 200 400
2025-09-04 13:13:23,605:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:23,608:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:23,629:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 390
2025-09-04 13:13:23,637:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131323_00837_dzkrm/y8fdacb009c66a2f32629c88c136ec55495c59218/1 HTTP/1.1" 200 390
2025-09-04 13:13:23,645:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131323_00837_dzkrm/y03aab984d508f845cea179dd1811f3f3742f5c2a/2 HTTP/1.1" 200 395
2025-09-04 13:13:23,658:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131323_00837_dzkrm/y7cf91e057019ea26eb73af13c17efae82d3e4d56/0 HTTP/1.1" 200 561
2025-09-04 13:13:23,833:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131323_00837_dzkrm/y2e4f0b8549e594b2d319dac0fb3f04f428d88c12/1 HTTP/1.1" 200 601
2025-09-04 13:13:23,840:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131323_00837_dzkrm/y572e6a9c5863a188503f513364ba098295c114ef/2 HTTP/1.1" 200 531
2025-09-04 13:13:23,844:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:23,847:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:23,850:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:23,871:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 389
2025-09-04 13:13:23,879:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131323_00838_dzkrm/y5eac8344d1414700f42e0f08b35c6f15eb1d70d0/1 HTTP/1.1" 200 390
2025-09-04 13:13:23,887:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131323_00838_dzkrm/yc6efef58f457fd21ac2168063fbb1914067fa370/2 HTTP/1.1" 200 396
2025-09-04 13:13:23,899:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131323_00838_dzkrm/y0890392f07dc05bb6b3ac2842a7c5b5400b8d5de/0 HTTP/1.1" 200 563
2025-09-04 13:13:24,056:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131323_00838_dzkrm/y62b444f1f802be44203dcd26d658a88fdc2ef72f/1 HTTP/1.1" 200 606
2025-09-04 13:13:24,067:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131323_00838_dzkrm/y40a3b88bee250c52477ad330d17b52df9e2a415b/2 HTTP/1.1" 200 531
2025-09-04 13:13:24,071:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 389
2025-09-04 13:13:24,079:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131324_00839_dzkrm/y20d06d0939720eabcc053a712c92c84b7fcce9fa/1 HTTP/1.1" 200 391
2025-09-04 13:13:24,244:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131324_00839_dzkrm/y90d2196b5698c16ac18172b9b0684a83551c95ff/2 HTTP/1.1" 200 402
2025-09-04 13:13:24,467:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131322_00836_dzkrm/y89e832115bc78ad200dfbe28c261cfefcd648459/0 HTTP/1.1" 200 None
2025-09-04 13:13:25,258:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131324_00839_dzkrm/y126c73227575881bffc78c578fbd07c11874264c/0 HTTP/1.1" 200 None
2025-09-04 13:13:25,474:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131322_00836_dzkrm/y1b9a91a9bfd17dceb2d84f123f75339cdb3ea8f7/1 HTTP/1.1" 200 None
2025-09-04 13:13:26,265:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131324_00839_dzkrm/yd511451a790deebb5a16ddaf06ba19c387b8c2ea/1 HTTP/1.1" 200 None
2025-09-04 13:13:26,480:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131322_00836_dzkrm/yd67733df22e30d69cc43315caa8ab66dd4f0b4d7/2 HTTP/1.1" 200 None
2025-09-04 13:13:26,521:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131322_00836_dzkrm/yd1cb66bf048427eb63ef00ad7f467b00eaee9c5d/3 HTTP/1.1" 200 None
2025-09-04 13:13:26,527:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131322_00836_dzkrm/y996c13cdc8d6001a3b5d6956cdc756fa755e092c/4 HTTP/1.1" 200 None
2025-09-04 13:13:26,558:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:26,561:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:26,583:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 387
2025-09-04 13:13:26,593:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131326_00840_dzkrm/yca6d0fe2f0840cc00a7fe95b095bbb847da86d03/1 HTTP/1.1" 200 390
2025-09-04 13:13:26,601:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131326_00840_dzkrm/y7f407fc26cef796827651dea9a4289cb9d8a8ed8/2 HTTP/1.1" 200 400
2025-09-04 13:13:26,613:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131326_00840_dzkrm/yb6684b1436396c37d5f128cdd8094820f288ca35/0 HTTP/1.1" 200 562
2025-09-04 13:13:26,644:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131326_00840_dzkrm/y98be81e11603dce68f836efdb8f8ed3fb92e35e5/1 HTTP/1.1" 200 657
2025-09-04 13:13:26,648:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131326_00840_dzkrm/y8dc868d29f508cbcebafe72d2e04fd3861a91634/2 HTTP/1.1" 200 531
2025-09-04 13:13:27,171:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131324_00839_dzkrm/yeac8a70ee3896769e74b1592502b7e20f38bb288/2 HTTP/1.1" 200 None
2025-09-04 13:13:27,763:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:27,765:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname/default
2025-09-04 13:13:27,766:DEBUG:superset.sql_parse:Parsing with sqlparse statement: SELECT * FROM default."tablename$partitions"

ORDER BY partition DESC, record_count DESC, file_count DESC, total_size DESC, data DESC
LIMIT 1

2025-09-04 13:13:27,769:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): trino:443
2025-09-04 13:13:27,791:DEBUG:urllib3.connectionpool:https://trino:443 "POST /v1/statement HTTP/1.1" 200 389
2025-09-04 13:13:27,800:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131327_00841_dzkrm/y6338400f0c120c59391935eafcaeeb8012868b1f/1 HTTP/1.1" 200 393
2025-09-04 13:13:27,974:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/queued/20250904_131327_00841_dzkrm/y87f892f84f13cf56c6d4974b2822ea3dd614db54/2 HTTP/1.1" 200 403
2025-09-04 13:13:27,987:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131327_00841_dzkrm/y67a7851426f370f26cd1e25198eba4cb8b922375/0 HTTP/1.1" 200 None
2025-09-04 13:13:28,993:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131327_00841_dzkrm/y020990749c22ad03a2d9b871fd7afa76e32feec4/1 HTTP/1.1" 200 None
2025-09-04 13:13:30,000:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131327_00841_dzkrm/y60130ed606beeb5bb46e03fe070e89edeb5c1fd8/2 HTTP/1.1" 200 None
2025-09-04 13:13:31,006:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131327_00841_dzkrm/y65f16b30eab2c6fee3cfc3c00ec5ada7a59438dd/3 HTTP/1.1" 200 None
2025-09-04 13:13:31,302:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131327_00841_dzkrm/y053ce9063b091e9fa8843df5dd597fa30f671932/4 HTTP/1.1" 200 None
2025-09-04 13:13:31,308:DEBUG:urllib3.connectionpool:https://trino:443 "GET /v1/statement/executing/20250904_131327_00841_dzkrm/y3493c33b9a1bb300d021a887e465cecf10a3e99d/5 HTTP/1.1" 200 None
2025-09-04 13:13:31,333:DEBUG:superset.models.core:Database._get_sqla_engine(). Masked URL: trino://trino:443/catalogname
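
A log like the one above is easier to read when the polling requests are grouped by Trino query ID. The following is a minimal sketch, assuming the urllib3 DEBUG line format shown here; the function name and the summary fields are made up for illustration. Note that the log does not include response bodies, so it cannot show whether the last response still carried a `nextUri`; it only shows when the client stopped polling.

```python
import re
from typing import Dict, Iterable

# Matches urllib3 DEBUG lines for polling requests such as
# ... "GET /v1/statement/executing/<query_id>/<slug>/<token> HTTP/1.1" 200 533
POLL_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}):DEBUG:"
    r"urllib3\.connectionpool:\S+ "
    r'"GET /v1/statement/(?P<state>queued|executing)/'
    r'(?P<query_id>[^/]+)/[^/]+/(?P<token>\d+) HTTP'
)


def summarize_polls(lines: Iterable[str]) -> Dict[str, dict]:
    """Group polling requests by query ID and keep the last one seen."""
    summary: Dict[str, dict] = {}
    for line in lines:
        match = POLL_RE.match(line)
        if match is None:
            continue  # POST requests, Superset messages, SQL text, etc.
        entry = summary.setdefault(match["query_id"], {"polls": 0})
        entry["polls"] += 1
        entry["last_state"] = match["state"]
        entry["last_token"] = int(match["token"])
        entry["last_seen"] = match["ts"]
    return summary
```

Feeding the Superset log file to `summarize_polls` line by line gives one entry per query ID with the number of polls and the time of the last poll, which can then be compared with the state Trino reports for the same query ID.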

maxgruber19 avatar Sep 04 '25 12:09 maxgruber19

CC @betodealmeida @villebro in case they have any guesses here.

rusackas avatar Nov 18 '25 00:11 rusackas

@rusackas Not yet. We compacted the tables via Trino and got rid of almost all of the metadata, which "solved" the problem. But I still think it would be much better to complete the queries once the partition information has been received.

maxgruber19 avatar Nov 18 '25 20:11 maxgruber19

This is likely an issue with the SQLAlchemy dialect and/or DB API 2.0 driver, but I'm curious why we're generating 3 separate queries. I'm taking a look.

betodealmeida avatar Nov 18 '25 20:11 betodealmeida

Good to know and thanks for taking a look at it.

What I'm noticing only now, after taking a second look at my own logs attached above: it seems Superset already tries to fetch only one row, but the query is somehow not formatted the right way?

2025-09-04 13:13:27,766:DEBUG:superset.sql_parse:Parsing with sqlparse statement: SELECT * FROM default."tablename$partitions"

ORDER BY partition DESC, record_count DESC, file_count DESC, total_size DESC, data DESC
LIMIT 1

But the query handled by trino is only the first line?

maxgruber19 avatar Nov 18 '25 21:11 maxgruber19

@betodealmeida Even after compacting the tables from 15k to ~300 partitions, the issue persists. So it seems to be a problem with query handling rather than with Iceberg tables having too many partitions, as assumed before.

Did you have a chance to look at that?

maxgruber19 avatar Dec 04 '25 10:12 maxgruber19