[Improve][Zeta] Disable hdfs filesystem cache of checkpoint

Open LeonYoah opened this issue 1 year ago • 3 comments

…cache function is disabled by default.

Purpose of this pull request

Does this PR introduce any user-facing change?

When hadoop-aws-3.1.4.jar and aws-java-sdk-bundle-1.11.271.jar are used to connect to HDFS or S3 file systems, FileSystem objects are cached by default. In multithreaded scenarios, a FileSystem object is often closed by one thread, which also closes its connection pool. If another thread then takes the same object from the cache, it can hit unexpected exceptions.
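The failure mode described above can be illustrated with a small, stdlib-only model. This is a sketch, not SeaTunnel or Hadoop code: `ToyFileSystem`, `ToyFileSystemFactory`, and `CacheDemo` are hypothetical names, and the `disableCache` flag only stands in for Hadoop's `fs.<scheme>.impl.disable.cache` setting (for example `fs.hdfs.impl.disable.cache`), which this PR presumably sets for the checkpoint storage.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a Hadoop FileSystem; not the real class. */
class ToyFileSystem implements AutoCloseable {
    private volatile boolean closed = false;

    String read(String path) {
        // Once closed, every holder of this instance fails.
        if (closed) {
            throw new IllegalStateException("Filesystem closed");
        }
        return "data:" + path;
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical factory that mimics a per-URI cache with a disable switch. */
class ToyFileSystemFactory {
    private static final Map<String, ToyFileSystem> CACHE = new ConcurrentHashMap<>();

    static ToyFileSystem get(String uri, boolean disableCache) {
        if (disableCache) {
            // Cache disabled: every caller owns a private instance.
            return new ToyFileSystem();
        }
        // Cache enabled: every caller for the same URI shares one instance.
        return CACHE.computeIfAbsent(uri, k -> new ToyFileSystem());
    }
}

class CacheDemo {
    /** Returns true if one holder can still read after another holder closes. */
    static boolean readerSurvivesOtherClose(boolean disableCache) {
        String uri = "hdfs://example-namenode:8020"; // placeholder URI
        ToyFileSystem first = ToyFileSystemFactory.get(uri, disableCache);
        ToyFileSystem second = ToyFileSystemFactory.get(uri, disableCache);
        first.close(); // e.g. another thread finishing its checkpoint work
        try {
            second.read("/checkpoint/1");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("cache enabled, reader survives:  " + readerSurvivesOtherClose(false));
        System.out.println("cache disabled, reader survives: " + readerSurvivesOtherClose(true));
    }
}
```

With the cache enabled, both callers receive the same object, so closing it in one place breaks the other. With the cache disabled, each caller gets its own instance and is responsible for closing it.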

How was this patch tested?

no

Check list

  • [ ] If your PR adds any new Jar binary package, please add a License Notice according to the New License Guide
  • [ ] If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs
  • [ ] If you are contributing the connector code, please check that the following files are updated:
    1. Update the change log in the connector document. For more details, refer to connector-v2
    2. Update plugin-mapping.properties and add new connector information in it
    3. Update the pom file of seatunnel-dist
  • [ ] Update the release-note.

LeonYoah, Apr 16 '24 10:04

I don't have an OSS environment, so I can't run the OSS unit tests: [image]

LeonYoah, Apr 16 '24 10:04

Could you add a test case to cover this bug?

Hisoka-X, Apr 28 '24 02:04

> Could you add a test case to cover this bug?

Currently there is no known connection-closure bug with the HDFS cache, so this change is a precaution.

LeonYoah, Apr 28 '24 02:04