kafka-connect-arangodb

Kafka Connect ArangoDB fails to create a collection if it is not present in the target database

Open vaibhavpatil123 opened this issue 6 years ago • 3 comments

I have the configuration below:

{
    "name": "development-arangodb-connector",
    "connector.class": "io.github.jaredpetersen.kafkaconnectarangodb.sink.ArangoDbSinkConnector",
    "tasks.max": "1",
    "topics": "customers",
    "arangodb.host": "192.168.56.1",
    "arangodb.port": 8529,
    "arangodb.user": "root",
    "arangodb.password": "admin",
    "transforms": "cdc",
    "transforms.cdc.type": "io.github.jaredpetersen.kafkaconnectarangodb.sink.transforms.Cdc",
    "arangodb.database.name": "development",
    "arangodb.batch.size": "100",
    "arangodb.max.retries": "10",
    "arangodb.writer.impl": "kafka.connect.arangodb.ArangoDBWriter"
}

Issue: the sink connector does not create the collection in the target database if it is not already present.

vaibhavpatil123 · Jun 22 '19 12:06

Yup, this is correct behavior. Kafka Connect ArangoDB does not create collections or databases at all -- it's your responsibility to do so beforehand. While it's theoretically possible to figure out what kind of collection you will need (edge collections are the ones with _to and _from fields so look for those fields), there's no way to predict what kind of indices you're going to need. Indices are super important to having a performant database and auto-creating collections enables users to forget about them.
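As a rough illustration of creating the collection ahead of time, here is a minimal Python sketch. It assumes the collection is named after the topic (`customers`), reuses the host, port, database, and credentials from the configuration in the question, and relies on ArangoDB's HTTP endpoint `POST /_db/{database}/_api/collection` with type codes 2 (document) and 3 (edge). The helper names `infer_collection_type` and `build_create_collection_request` are made up for this example and are not part of the connector. Indexes are not handled here and still need to be defined separately.

```python
import base64
import json
import urllib.request

DOCUMENT_COLLECTION = 2  # ArangoDB type code for a document collection
EDGE_COLLECTION = 3      # ArangoDB type code for an edge collection


def infer_collection_type(record: dict) -> int:
    """Guess the collection type from one sample record (hypothetical helper).

    Records carrying both _from and _to are treated as edges.
    """
    if "_from" in record and "_to" in record:
        return EDGE_COLLECTION
    return DOCUMENT_COLLECTION


def build_create_collection_request(host, port, database, name,
                                    collection_type, user, password):
    """Build (but do not send) the HTTP request that creates a collection."""
    url = f"http://{host}:{port}/_db/{database}/_api/collection"
    body = json.dumps({"name": name, "type": collection_type}).encode("utf-8")
    # HTTP basic authentication header built from the connector credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Values below are taken from the configuration in the question.
request = build_create_collection_request(
    "192.168.56.1", 8529, "development", "customers",
    infer_collection_type({"_key": "1", "name": "Ada"}), "root", "admin",
)
```

Passing `request` to `urllib.request.urlopen` before starting the connector would send it. The request is only built here so the sketch can be inspected without a running ArangoDB instance.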

It looks like this isn't documented explicitly like I thought it was. Even the development docs hide the database collection creation away from you. I'll keep this issue open as a reminder to document this.

Thanks for bringing it up!

jaredpetersen · Jun 22 '19 18:06

Thanks for the timely reply. Is there any plan to write a source Kafka connector? :)

vaibhavpatil123 · Jun 23 '19 08:06

That's definitely one of the next things I'd like to do with this. I've been working on converting the existing docker compose development setup to use a clustered ArangoDB via Kubernetes first. The clustered form is much more difficult to use as a source system due to the architecture, but starting with it helps us avoid writing code into a corner.

jaredpetersen · Jun 23 '19 17:06