[Feature][Transform] Introduce tika transform

Open liugddx opened this issue 3 months ago • 6 comments

Purpose of this pull request

https://github.com/apache/seatunnel/issues/9861

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

  • [ ] If any new Jar binary package is added in your PR, please add a License Notice according to the New License Guide
  • [ ] If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs
  • [ ] If you are contributing the connector code, please check that the following files are updated:
    1. Update plugin-mapping.properties and add new connector information in it
    2. Update the pom file of seatunnel-dist
    3. Add ci label in label-scope-conf
    4. Add e2e testcase in seatunnel-e2e
    5. Update connector plugin_config

liugddx avatar Sep 15 '25 00:09 liugddx

@corgy-w @zhangshenghang PTAL

liugddx avatar Oct 13 '25 10:10 liugddx

Thanks @liugddx. Do we have an IT class that actually reads files, for example a PDF? It would be best if you could provide one.

zhangshenghang avatar Oct 19 '25 10:10 zhangshenghang

+1

Hisoka-X avatar Oct 19 '25 10:10 Hisoka-X

Done. @zhangshenghang @Hisoka-X

liugddx avatar Oct 23 '25 05:10 liugddx

LGTM, if CI passes.

zhangshenghang avatar Oct 23 '25 13:10 zhangshenghang

@liugddx Can you fix CI for this PR?

davidzollo avatar Dec 04 '25 02:12 davidzollo