MongoDB - Sink - PluginIdentifier not found

Open a11dev opened this issue 9 months ago • 14 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

A simple SeaTunnel configuration, designed to sink Oracle to MongoDB. Oracle to Postgres is working. I added a new sink:

	MongoDB {
		source_table_name = "source"
		uri = "mongodb://user:pwd@ipaddress:27017/dbname?readPreference=secondary&slaveOk=true"
		database = "dbname"
		collection = "destcollection"
		upsert-enable = true
		primary-key = ["pkname"]

	}

It ends with the exception shown below.

It seems related to the MongoDB Java library, but which jar is needed and where must it be installed? I installed it in several ways, but nothing I tried works.

I tried with:

mongodb-driver-core-5.1.0
mongodb-java-driver-5.1.0

and as an alternative:

mongodb-driver-sync-5.1.0

placing it into:

seatunnel_home\lib
plugins\MongoDB\lib
connectors\seatunnel

The documentation link to the right download is failing.
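The stack trace in this report shows the error being raised from AbstractPluginDiscovery.createPluginInstance, which suggests the connector jar itself was not discovered, rather than the MongoDB driver. A minimal sketch of a check for that follows. It assumes the connector jar is named with the prefix `connector-mongodb` and lives somewhere under a `connectors` directory in the SeaTunnel home; the function name and the recursive search are illustrative and not part of SeaTunnel.

```python
from pathlib import Path


def find_connector_jars(seatunnel_home, connector="connector-mongodb"):
    """Return jar paths under <seatunnel_home>/connectors whose file name
    starts with the given prefix. The prefix and the directory layout are
    assumptions, so the search is recursive to cover sub-directories."""
    connectors_dir = Path(seatunnel_home) / "connectors"
    if not connectors_dir.is_dir():
        return []
    return sorted(
        p for p in connectors_dir.rglob("*.jar") if p.name.startswith(connector)
    )
```

Calling `find_connector_jars(r"E:\programmi\apache-seatunnel-2.3.4")` and getting an empty list would mean no such jar is present under that directory.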

SeaTunnel Version

2.3.4

SeaTunnel Config

# Also in the first attachment:

env {
  parallelism = 2
  job.mode=STREAMING
  job.name=SeaTunnel_Job
  read_limit.bytes_per_second=7000000
  read_limit.rows_per_second=400
}

source {
  Oracle-CDC {
    result_table_name = "tab1"
    base-url = "jdbc:oracle:thin:user/password@ip:1521/service_name"
    source.reader.close.timeout = 120000
    username = "user"
    password = "password"
    database-names = ["DBNAME"]
    # real db name DBNAME.domain.local ( it works with DBNAME )
    schema-names = ["SCHEMA"]
    startup.mode = "INITIAL"
    table-names = ["DBNAME.SCHEMA.TABLE1"]
  }
}

sink {
  MongoDB {
    source_table_name = "source"
    uri = "mongodb://user:pwd@ipaddress:27017/dbname?readPreference=secondary&slaveOk=true"
    database = "dbname"
    collection = "destcollection"
    upsert-enable = true
    primary-key = ["pkname"]
  }
}


Running Command

```shell
java -Dlog4j2.configurationFile=E:\programmi\apache-seatunnel-2.3.4\config\log4j2_client.properties -Dhazelcast.client.config=E:\programmi\apache-seatunnel-2.3.4\config\hazelcast-client.yaml -Dseatunnel.config=E:\programmi\apache-seatunnel-2.3.4\config\seatunnel.yaml -Dhazelcast.config=E:\programmi\apache-seatunnel-2.3.4\config\hazelcast.yaml -Dseatunnel.logs.file_name=seatunnel-starter-clienttest -Xms256m -Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=E:\programmi\apache-seatunnel-2.3.4\dump\zeta-client  -cp E:\programmi\apache-seatunnel-2.3.4\lib\*;E:\programmi\apache-seatunnel-2.3.4\starter\seatunnel-starter.jar org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient  --config .\config\v2.batch.config.template -m local
```

Error Exception

2024-05-06 13:42:06,610 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Fatal Error, 

2024-05-06 13:42:06,611 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues

2024-05-06 13:42:06,612 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Reason:SeaTunnel job executed failed 

2024-05-06 13:42:06,614 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:199)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: java.lang.RuntimeException: Plugin PluginIdentifier{engineType='seatunnel', pluginType='sink', pluginName='MongoDB'} not found.
	at org.apache.seatunnel.plugin.discovery.AbstractPluginDiscovery.createPluginInstance(AbstractPluginDiscovery.java:231)
	at org.apache.seatunnel.engine.core.parse.ConnectorInstanceLoader.loadSinkInstance(ConnectorInstanceLoader.java:77)
	at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSink(JobConfigParser.java:194)
	at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSinks(JobConfigParser.java:170)
	at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSink(MultipleTableJobConfigParser.java:531)
	at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:193)
	at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:88)
	at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:161)
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:146)
	... 2 more

Zeta or Flink or Spark Version

Zeta

Java or Scala Version

Java

Screenshots

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

a11dev avatar May 06 '24 12:05 a11dev

Could you try with 2.3.5? It should be fixed by #6551

Hisoka-X avatar May 09 '24 03:05 Hisoka-X

Thanks, I will try that (with 2.3.4 connectors). Where can the 2.3.5 connectors be downloaded from? The docs links are failing!

Alessandro

a11dev avatar May 09 '24 07:05 a11dev

You can get the download link from https://www.apache.org/dyn/closer.lua/seatunnel/2.3.5/apache-seatunnel-2.3.5-bin.tar.gz

Hisoka-X avatar May 09 '24 07:05 Hisoka-X

Again:

2024-05-09 08:16:48,437 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: java.lang.RuntimeException: Plugin PluginIdentifier{engineType='seatunnel', pluginType='sink', pluginName='MongoDB'} not found.
	at org.apache.seatunnel.plugin.discovery.AbstractPluginDiscovery.createPluginInstance(AbstractPluginDiscovery.java:234)
	at org.apache.seatunnel.engine.core.parse.ConnectorInstanceLoader.loadSinkInstance(ConnectorInstanceLoader.java:77)
	at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSink(JobConfigParser.java:159)
	at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSinks(JobConfigParser.java:135)
	at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSink(MultipleTableJobConfigParser.java:517)
	at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:200)
	at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:88)
	at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:156)
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:149)
	... 2 more

The point is: which jar is missing, and where must it be placed? Since I'm using the Zeta engine, I assume it belongs in the SeaTunnel .\lib directory, so I added mongodb-driver-sync-5.1.0.jar there.

MongoDB sink:

	MongoDB {
		source_table_name = "assignments"
		uri = @.*** :27017/dbname?readPreference=secondary&slaveOk=true"
		database = "dbname"
		collection = "assignments"
		upsert-enable = true
		primary-key = ["w6key"]
	}

thanks Alessandro

a11dev avatar May 09 '24 07:05 a11dev

Did you execute bin/install-plugin.sh? https://seatunnel.apache.org/docs/2.3.5/start-v2/locally/deployment#step-3-install-connectors-plugin

Hisoka-X avatar May 09 '24 08:05 Hisoka-X

The connectors are not available.

This is the install_plugin output:

[...]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:2.8:get (default-cli) on project standalone-pom: Couldn't download artifact: Missing:
[ERROR] ----------
[ERROR] 1) org.apache.seatunnel: connector-mongodb:jar:2.3.5
[ERROR]
[ERROR] Try downloading the file manually from the project website.
[ERROR]
[ERROR] Then, install it using the command:
[ERROR] mvn install:install-file -DgroupId=org.apache.seatunnel -DartifactId= connector-mongodb -Dversion=2.3.5 -Dpackaging=jar -Dfile=/path/to/file
[ERROR]
[ERROR] Alternatively, if you host your own repository you can deploy the file there:
[ERROR] mvn deploy:deploy-file -DgroupId=org.apache.seatunnel -DartifactId= connector-mongodb -Dversion=2.3.5 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR]
[ERROR] Path to dependency:
[ERROR] 1) org.apache.maven.plugins:maven-downloader-plugin:jar:1.0
[ERROR] 2) org.apache.seatunnel: connector-mongodb:jar:2.3.5
[ERROR]
[ERROR] ----------
[ERROR] 1 required artifact is missing.
[...]

Also, the connector links on the SeaTunnel website are not working; it seems the connector libraries are no longer available from the repository. Has the URL changed?
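For a manual download, the jar location can be derived from the Maven coordinates using the standard repository layout. The sketch below does only that string construction. The default base URL is Maven Central and is an assumption here, as is the availability of the artifact at the resulting address. Surrounding whitespace is stripped from each coordinate because the Maven output above prints the artifact ID with a leading space; the thread does not establish whether that space contributed to the failure.

```python
def maven_jar_url(group_id, artifact_id, version,
                  base_url="https://repo1.maven.org/maven2"):
    """Build a jar URL from Maven coordinates using the standard layout:
    <base>/<group with dots as slashes>/<artifact>/<version>/<artifact>-<version>.jar
    The base URL is an assumption; pass another repository if needed."""
    group_path = group_id.strip().replace(".", "/")
    artifact = artifact_id.strip()
    ver = version.strip()
    return f"{base_url.rstrip('/')}/{group_path}/{artifact}/{ver}/{artifact}-{ver}.jar"


# Coordinates taken from the error output, including the stray leading space.
print(maven_jar_url("org.apache.seatunnel", " connector-mongodb", "2.3.5"))
```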

thanks Alessandro

a11dev avatar May 13 '24 10:05 a11dev

cc @liugddx

Hisoka-X avatar May 15 '24 02:05 Hisoka-X

Hi Jia, I have resumed my SeaTunnel tests. I am still trying to sync MongoDB from Oracle CDC.

First, as I wrote, the Apache SeaTunnel documentation points to a wrong Maven repository. In the end I found the connector component by manually navigating https://mvnrepository.com/artifact/org.apache.seatunnel. The plugin installer throws an exception because the Maven repository is not available. The documentation on manual installation doesn't mention the other libraries that are needed, mongodb-driver-sync-4.7.1 and mongodb-driver-core-4.7.1, as required by the connector manifest.

Anyway, I got this working.
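Since a jar is a zip archive, its manifest can be inspected without extra tools. The following is a rough sketch that reads the main section of META-INF/MANIFEST.MF into a dictionary, joining continuation lines. Which attribute, if any, names the driver jars is not stated in the thread, so the helper simply returns everything; the function name is illustrative.

```python
import zipfile


def read_manifest(jar):
    """Return the main-section attributes of META-INF/MANIFEST.MF as a dict.
    `jar` may be a file path or a binary file-like object."""
    with zipfile.ZipFile(jar) as archive:
        text = archive.read("META-INF/MANIFEST.MF").decode("utf-8")

    attributes = {}
    last_key = None
    for line in text.splitlines():
        if line == "":
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            # Continuation line: append it without its single leading space.
            attributes[last_key] += line[1:]
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            attributes[last_key] = value.lstrip(" ")
    return attributes
```

Printing the result for the connector jar gives the full list of main attributes to look through.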

Now I'm able to run the documentation example, but not a real scenario. I have a big Oracle table that I would like to transform into documents in a Mongo collection. To keep it simple, I applied a transformation in the middle:

[...]
transform {
  Sql {
    source_table_name = "assignments"
    result_table_name = "mongoassignments"
    query = "select pkname from assignments"
  }
}
[...]

MongoDB {
  source_table_name = ["mongoassignments"]
  uri = @.***:27017/?authSource=dbname"
  database = "dbname"
  collection = "assignmentskeys"
  upsert-enable = true
  primary-key = [" pkname "]

  schema = {
    fields {
      _id = string
      pkname = bigint
    }
  }
}

But it throws this error: java: Caused by: com.mongodb.MongoBulkWriteException: Bulk write operation error on server XXXXXX:27017. Write errors: [BulkWriteError{index=0, code=2, message='$and/$or/$nor must be a nonempty array', details={}}].

Any idea?

Do you think it might be possible to set up a dev environment and try debugging it?
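The write error reported above says the server received a `$and`, `$or` or `$nor` with an empty array. How the sink ended up sending one is not explained in the thread. Purely as an illustration, and not as a description of SeaTunnel's code, the hypothetical helper below shows one way such a filter can arise: if key names do not match any field of the row (for example a name with surrounding spaces, as in the `primary-key` value shown in the config above, which may equally be an artifact of the email formatting), the list of conditions stays empty.

```python
def build_upsert_filter(row, primary_keys):
    """Hypothetical helper (not SeaTunnel code): build a MongoDB filter that
    matches a row on every primary-key field that is present in the row."""
    conditions = [{key: row[key]} for key in primary_keys if key in row]
    return {"$and": conditions}


row = {"pkname": 42}
print(build_upsert_filter(row, ["pkname"]))    # {'$and': [{'pkname': 42}]}
print(build_upsert_filter(row, [" pkname "]))  # {'$and': []}
```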

thanks a lot Alessandro

a11dev avatar Jun 24 '24 12:06 a11dev

Sorry... I did it! The basic transfer is working now, so no help is needed on this topic. Sorry to have disturbed you! Is there a way to update a collection in order to perform a denormalization along the Oracle to Mongo sync?

Ale

a11dev avatar Jun 24 '24 13:06 a11dev

Is there a way to update a collection in order to perform a denormalization along the Oracle to Mongo sync?

Not sure what you want. Could you share an example?

Hisoka-X avatar Jun 25 '24 02:06 Hisoka-X

Yes. The source is Oracle, with tables A and B; the destination is MongoDB, with a collection called AB where documents are created by embedding B into A: {A,{B}}. Is there a way to configure a sink to embed B into A?
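To make the requested shape concrete, here is a small sketch of the embedding done in plain Python, outside SeaTunnel. The join columns (`parent_key`, `child_fk`) and the name of the embedded field are hypothetical, since the thread does not say how A and B are related.

```python
from collections import defaultdict


def embed_children(parents, children, parent_key, child_fk, field="B"):
    """Return one document per parent row, with the matching child rows
    embedded as a list under `field`. All names are illustrative."""
    by_parent = defaultdict(list)
    for child in children:
        by_parent[child[child_fk]].append(child)
    return [
        {**parent, field: by_parent.get(parent[parent_key], [])}
        for parent in parents
    ]


table_a = [{"id": 1, "name": "x"}, {"id": 2, "name": "y"}]
table_b = [{"a_id": 1, "v": 10}, {"a_id": 1, "v": 11}]
for doc in embed_children(table_a, table_b, "id", "a_id"):
    print(doc)
```

Each resulting dictionary corresponds to one document of the AB collection; parents without children get an empty list.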

Thanks a lot. Alessandro

a11dev avatar Jun 25 '24 05:06 a11dev

Sorry, SeaTunnel cannot do this at the moment.

Hisoka-X avatar Jun 25 '24 05:06 Hisoka-X

Thanks

Alessandro

a11dev avatar Jun 25 '24 06:06 a11dev

Again, thank you for your kind response; I would also like to compliment you! You have all done a fantastic job. SeaTunnel is an excellent tool for data synchronization: a single scalable application to manage all sources and destinations, without additional components that increase the complexity of configuration and maintenance.

Thanks Ale

a11dev avatar Jun 25 '24 06:06 a11dev