
[Feature][Connector-e2e] add ftp e2e test

Open MonsterChenzhuo opened this issue 1 year ago • 3 comments

Purpose of this pull request

close #4501

Check list

  • [ ] Code changes are covered by tests, or they do not need tests for this reason:
  • [ ] If your PR adds any new Jar binary package, please add a License Notice according to the New License Guide
  • [ ] If necessary, please update the documentation to describe the new feature. https://github.com/apache/incubator-seatunnel/tree/dev/docs
  • [ ] If you are contributing the connector code, please check that the following files are updated:
    1. Update the change log in the connector document. For more details, refer to connector-v2
    2. Update plugin-mapping.properties and add new connector information in it
    3. Update the pom file of seatunnel-dist
  • [ ] Update the release-note.

MonsterChenzhuo avatar Apr 22 '23 15:04 MonsterChenzhuo

@TyrantLucifer
At first, I tried excluding the commons-net package from the hadoop dependency in the shade configuration. Running on the flink engine then failed with this error:

java.lang.LinkageError: loader constraint violation: loader (instance of sun/misc/Launcher$AppClassLoader) previously initiated loading for a different type with name "org/apache/commons/net/ftp/FTPClient"
	at org.apache.hadoop.fs.ftp.FTPInputStream.<init>(FTPInputStream.java:44)

The exception is thrown by hadoop.fs.ftp.FTPInputStream, so I think commons-net cannot be excluded from the hadoop dependency.

Instead, I will remove the commons-net dependency under connector-file-base and load the ftp-related dependencies through seatunnel-hadoop3-3.1.4-uber.

After the removal, the flink and zeta engines ran successfully, but spark reported the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/net/ftp/FTPClient
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.system.SeaTunnelFTPFileSystem.connect(SeaTunnelFTPFileSystem.java:131)
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.system.SeaTunnelFTPFileSystem.listStatus(SeaTunnelFTPFileSystem.java:389)
	at org.apache.seatunnel.connectors.seatunnel.file.source.reader.AbstractReadStrategy.getFileNamesByPath(AbstractReadStrategy.java:123)
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.source.FtpFileSource.prepare(FtpFileSource.java:83)
	at org.apache.seatunnel.core.starter.spark.execution.SourceExecuteProcessor.initializePlugins(SourceExecuteProcessor.java:104)
	at org.apache.seatunnel.core.starter.spark.execution.SparkAbstractPluginExecuteProcessor.<init>(SparkAbstractPluginExecuteProcessor.java:49)
	at org.apache.seatunnel.core.starter.spark.execution.SourceExecuteProcessor.<init>(SourceExecuteProcessor.java:51)
	at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.<init>(SparkExecution.java:57)
	at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:59)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.spark.SeaTunnelSpark.main(SeaTunnelSpark.java:35)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.net.ftp.FTPClient
	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	... 23 more

I don't know if there is a better solution for the time being...

MonsterChenzhuo avatar May 03 '23 17:05 MonsterChenzhuo
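
Both errors above come down to how many copies of org.apache.commons.net.ftp.FTPClient a class loader can see: none on spark, and possibly more than one on flink. A minimal diagnostic sketch, not part of this PR, is shown below. The class name ClasspathCopies and its helper methods are hypothetical. It only inspects the given class loader and its parents, so copies held by a separate plugin class loader, or relocated by shading under a different package name, will not show up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of SeaTunnel): lists every location from which
// a class file is visible to a class loader and its parents.
final class ClasspathCopies {

    // Converts "a.b.C" into the resource path "a/b/C.class".
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns one URL per visible copy of the class file; empty if none is visible.
    static List<URL> findCopies(String className, ClassLoader loader) {
        try {
            return new ArrayList<>(Collections.list(loader.getResources(toResourcePath(className))));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.commons.net.ftp.FTPClient";
        List<URL> copies = findCopies(name, ClassLoader.getSystemClassLoader());
        System.out.println(copies.size() + " visible location(s) for " + name);
        for (URL url : copies) {
            System.out.println("  " + url);
        }
    }
}
```

Running it with the same classpath as the failing job would be the interesting case: zero locations matches the NoClassDefFoundError on spark, while two or more suggest a conflict worth checking for the LinkageError.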

@TyrantLucifer I have closed #4702 and confirmed that it was a permission problem. After reconfiguring the ftp permissions, I pushed the change to the current branch.

MonsterChenzhuo avatar May 11 '23 07:05 MonsterChenzhuo

> I don't know if there is a better solution for the time being...

That is because spark uses its own hadoop dependency, not seatunnel-hadoop3-3.1.4-uber. So maybe you should put commons-net.jar into the spark lib (or jars? I forget :) ) directory. If it works, please add this approach to the doc.

Hisoka-X avatar May 15 '23 10:05 Hisoka-X
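
One way to confirm that a jar copied into the spark directory is actually picked up is to ask the JVM where it loaded the class from. The sketch below is an illustration only; the class name LoadedFrom and the method locate are hypothetical and not part of SeaTunnel or Spark. It assumes it runs on the same classpath as the spark job.

```java
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper (not part of SeaTunnel or Spark): reports which jar or
// directory a class was loaded from, or nothing if it cannot be loaded.
final class LoadedFrom {

    static final String NO_CODE_SOURCE = "(bootstrap or unknown location)";

    static Optional<String> locate(String className, ClassLoader loader) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> type = Class.forName(className, false, loader);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return Optional.of(NO_CODE_SOURCE);
            }
            return Optional.of(source.getLocation().toString());
        } catch (ClassNotFoundException | LinkageError e) {
            // Covers both failure modes seen in this thread.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.commons.net.ftp.FTPClient";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        Optional<String> location = locate(name, loader);
        System.out.println(location.map(l -> name + " loaded from " + l)
                .orElse(name + " is not loadable from this class loader"));
    }
}
```

If the printed location is the commons-net jar inside the spark directory, the workaround took effect; if the class is reported as not loadable, the jar is still missing from the classpath spark uses.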