[Feature][ConnectorV2]add file excel sink

Open Bingz2 opened this issue 2 years ago • 16 comments

Purpose of this pull request

Add [File] Excel sink: https://github.com/apache/incubator-seatunnel/issues/1946

Check list

  • [x] Code changes are covered with tests, or do not need tests for reason:
  • [ ] If any new Jar binary package is added in your PR, please add a License Notice according to the New License Guide
  • [ ] If necessary, please update the documentation to describe the new feature. https://github.com/apache/incubator-seatunnel/tree/dev/docs

Bingz2 avatar Aug 31 '22 15:08 Bingz2

@TyrantLucifer Hi, PTAL

CalvinKirs avatar Sep 01 '22 00:09 CalvinKirs

image

Your e2e test cases also have some problems; please check.

TyrantLucifer avatar Sep 02 '22 05:09 TyrantLucifer

image

If I add this code, the Excel sink writer's finishAndCloseWriteFile method is called twice when the job runs on Flink, which results in an error.

However, if I remove this code and run the job on Spark, the Text sink writer is executed instead of the Excel sink writer.

Bingz2 avatar Sep 02 '22 13:09 Bingz2

Can you provide the detailed call stack information?

TyrantLucifer avatar Sep 03 '22 08:09 TyrantLucifer

image image

SparkDataWriter's commit() method clears commitInfo after execution. However, the prepareCommit of FlinkSinkWriter does not clear commitInfo, so when the close() method of FlinkSinkWriter is called it tries to write the Excel workbook again, even though the workbook was already closed when prepareCommit executed. I don't understand why SparkDataWriter and FlinkSinkWriter have different commit logic. Do I need to check in finishAndCloseWriteFile whether the workbook is already closed?

Bingz2 avatar Sep 03 '22 09:09 Bingz2

The reason is already in the code comment: "combine the prepareCommit and commit in this method". Flink supports both Committer and GlobalCommitter, but Spark only supports GlobalCommitter (same logic, different name). So Spark uses commit() to run both prepareCommit() and Committer.commit().
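The two lifecycles discussed above, and one possible way to make the file-closing step safe to call twice, can be sketched roughly as follows. This is only an illustration: `ExcelWriterSketch`, `closeCount`, and the `runFlinkStyle`/`runSparkStyle` helpers are hypothetical names, not SeaTunnel classes, and the sketch assumes (as described in this thread) that both prepareCommit() and close() end up calling finishAndCloseWriteFile.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitLifecycleSketch {

    /** Hypothetical stand-in for the Excel write strategy; not SeaTunnel's actual class. */
    static class ExcelWriterSketch {
        private boolean workbookClosed = false;
        private int closeCount = 0;

        /** Guarded so that a second call is a no-op instead of an error. */
        void finishAndCloseWriteFile() {
            if (workbookClosed) {
                return;
            }
            closeCount++; // the real code would flush and close the workbook here
            workbookClosed = true;
        }

        /** Assumed to finish the current file and return what needs committing. */
        List<String> prepareCommit() {
            finishAndCloseWriteFile();
            List<String> commitInfo = new ArrayList<>();
            commitInfo.add("hypothetical-file-path");
            return commitInfo;
        }

        /** Assumed to finish the file as well, in case prepareCommit() never ran. */
        void close() {
            finishAndCloseWriteFile();
        }

        int closeCount() {
            return closeCount;
        }
    }

    /** Flink-style: prepareCommit() and close() are separate calls on the same writer. */
    static int runFlinkStyle() {
        ExcelWriterSketch writer = new ExcelWriterSketch();
        writer.prepareCommit();
        writer.close(); // without the guard, this would touch a closed workbook
        return writer.closeCount();
    }

    /** Spark-style: a single commit() step covers prepareCommit() plus the committer's commit. */
    static int runSparkStyle() {
        ExcelWriterSketch writer = new ExcelWriterSketch();
        List<String> commitInfo = writer.prepareCommit();
        commitInfo.clear(); // stands in for committing and then clearing commitInfo
        return writer.closeCount();
    }

    public static void main(String[] args) {
        System.out.println("flink-style closes: " + runFlinkStyle());
        System.out.println("spark-style closes: " + runSparkStyle());
    }
}
```

With the guard in place, the workbook is closed once on either path. Whether a guard like this or a change in the translation layer is the better fix is a design decision for the maintainers.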

Hisoka-X avatar Sep 05 '22 08:09 Hisoka-X

OK. My local E2E run and compilation passed, but CI/CD failed, and I didn't use ES.

image

Bingz2 avatar Sep 05 '22 13:09 Bingz2

Your e2e tests did not pass; please fix the problem. Thanks.

Hisoka-X avatar Sep 26 '22 03:09 Hisoka-X

When using SXSSFWorkbook to create a sheet, everything works locally, but the E2E run reports an error. With OpenJDK, a null pointer exception is thrown because no fonts are installed. Reference: https://blog.csdn.net/progammer10086/article/details/107154814?spm=1001.2101.3001.6650.3&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-3-107154814-blog-116048731.t0_edu_mlt&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-3-107154814-blog-116048731.t0_edu_mlt&utm_relevant_index=6

Bingz2 avatar Sep 26 '22 06:09 Bingz2

Can you install the fonts in Docker before submitting the job, using executeExtraCommands? Also, please add documentation telling users what to do if they use OpenJDK.
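As a rough way to check whether a given JDK or container image is affected, a small probe like the one below could be run before the job. It is a sketch that assumes the failure comes from AWT font lookup, as the linked reference suggests. `FontProbe`, `fontsAvailable`, and `describe` are hypothetical names; the probe uses only the standard `java.awt.GraphicsEnvironment` API and does not involve POI or SeaTunnel.

```java
import java.awt.GraphicsEnvironment;

public class FontProbe {

    /** Returns true if the JVM can list at least one font family. */
    static boolean fontsAvailable() {
        try {
            String[] families = GraphicsEnvironment
                    .getLocalGraphicsEnvironment()
                    .getAvailableFontFamilyNames();
            return families != null && families.length > 0;
        } catch (RuntimeException | Error e) {
            // On images without fonts or fontconfig, the lookup itself may fail;
            // treat any such failure as "no fonts available".
            return false;
        }
    }

    /** Builds a short message suitable for a log line or documentation check. */
    static String describe(boolean available) {
        return available
                ? "fonts available"
                : "no fonts found: install a font package in the image before running the job";
    }

    public static void main(String[] args) {
        // Headless mode avoids needing a display inside a container.
        System.setProperty("java.awt.headless", "true");
        System.out.println(describe(fontsAvailable()));
    }
}
```

Which font package to install depends on the base image, so it is left out here.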

Hisoka-X avatar Sep 26 '22 06:09 Hisoka-X

Hi @Bingz2, thanks for your contribution. What's the latest on this PR? Do you need any help?

EricJoy2048 avatar Oct 13 '22 05:10 EricJoy2048

Sorry, I've been a little busy recently. I'll finish it this weekend.

Bingz2 avatar Oct 14 '22 01:10 Bingz2

Thank you very much.

EricJoy2048 avatar Oct 15 '22 10:10 EricJoy2048

@TyrantLucifer PTAL

EricJoy2048 avatar Oct 16 '22 14:10 EricJoy2048

image

There is an error in the CI, but I ran the E2E tests locally with no errors. I don't know why.

Bingz2 avatar Oct 30 '22 09:10 Bingz2

Don't worry. Just resolve the conflicts and retry the CI. Thanks.

EricJoy2048 avatar Oct 31 '22 09:10 EricJoy2048

@Bingz2 Will this PR go ahead? If not, I would like to take over and finish it.

MonsterChenzhuo avatar Jan 17 '23 13:01 MonsterChenzhuo

OK, thanks.

Bingz2 avatar Jan 18 '23 06:01 Bingz2