
[Improvement]: FileIO should be shared across tables within the same catalog

Open liaoyt opened this issue 1 week ago • 0 comments

Search before asking

  • [x] I have searched in the issues and found no similar issues.

What would you like to be improved?

In our long-running AMS process we see an excessive number of OSS/HTTP connection pools. Heap analysis shows a very large number of org.apache.http.impl.conn.PoolingHttpClientConnectionManager instances, which suggests that the Iceberg FileIO (or its underlying HTTP client) is created per table and not reused. Since FileIO is expected to be thread-safe, we think FileIO (or at least the underlying HTTP/OSS client) should be shared across tables within the same catalog to avoid excessive connection pools, memory usage, and open sockets.

How should we improve?

Add catalog-scoped reuse/caching of FileIO (or its internal HTTP client/connection manager), so that tables with the same catalog and configuration share a single instance, which is closed only on catalog or service shutdown.
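
A minimal sketch of what such catalog-scoped reuse could look like, not an actual Amoro or Iceberg implementation. The class name CatalogIoCache, its Key type, and the factory function are all hypothetical; the type parameter T stands in for FileIO or for the underlying HTTP/OSS client. It assumes the shared instance is thread-safe and that the catalog name plus its IO-related properties is enough to tell configurations apart.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache: one shared IO instance per (catalog name, IO properties). */
final class CatalogIoCache<T extends AutoCloseable> implements AutoCloseable {

  /** Cache key built from the catalog name and a copy of its properties. */
  static final class Key {
    final String catalog;
    final Map<String, String> props;

    Key(String catalog, Map<String, String> props) {
      this.catalog = Objects.requireNonNull(catalog, "catalog");
      this.props = new TreeMap<>(props); // defensive copy, so later caller mutations do not affect the key
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof Key)) return false;
      Key other = (Key) o;
      return catalog.equals(other.catalog) && props.equals(other.props);
    }

    @Override
    public int hashCode() {
      return Objects.hash(catalog, props);
    }
  }

  private final Map<Key, T> cache = new ConcurrentHashMap<>();
  private final Function<Map<String, String>, T> factory;

  /** The factory is assumed to build a new, thread-safe IO instance from properties. */
  CatalogIoCache(Function<Map<String, String>, T> factory) {
    this.factory = Objects.requireNonNull(factory, "factory");
  }

  /** Returns the shared instance for this catalog/config, creating it at most once. */
  T get(String catalog, Map<String, String> props) {
    return cache.computeIfAbsent(new Key(catalog, props), k -> factory.apply(k.props));
  }

  /** Number of distinct shared instances currently held. */
  int size() {
    return cache.size();
  }

  /** Closes every shared instance; intended for catalog or service shutdown only. */
  @Override
  public void close() {
    IllegalStateException failure = null;
    for (T io : cache.values()) {
      try {
        io.close();
      } catch (Exception e) {
        if (failure == null) failure = new IllegalStateException("failed to close shared IO");
        failure.addSuppressed(e);
      }
    }
    cache.clear();
    if (failure != null) throw failure;
  }
}
```

With something like this, table loading would call get(catalogName, catalogProperties) instead of constructing a new FileIO per table, and tables would not close the instance themselves. Whether the cache should hold the FileIO itself or only the HTTP/OSS client underneath it depends on how each FileIO implementation manages its client, which this sketch does not address.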

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

liaoyt • Dec 17 '25 07:12