cp-docker-images Auto-load connectors from directory

Open OneCricketeer opened this issue 6 years ago • 15 comments

Related to confluentinc/cp-docker-images#460

kafka-connect-base should have a directory from which .json or .properties connector files are loaded on startup.

See the MySQL container for inspiration (its image runs the init scripts it finds in /docker-entrypoint-initdb.d on first start).

Apr 23 '18 23:04 OneCricketeer

One alternative, as shown by @rmoff:

In the Compose file, override the container command:

  volumes:
    - $PWD/scripts:/scripts  # TODO: Create this folder ahead of time, on your host
  command: 
    - bash 
    - -c 
    - |
      /etc/confluent/docker/run & 
      echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
      while [ $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) -eq 000 ] ; do 
        echo -e $$(date) " Kafka Connect listener HTTP state: " $$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors) " (waiting for 200)"
        sleep 5 
      done
      nc -vz kafka-connect 8083
      echo -e "\n--\n+> Creating Kafka Connector(s)"
      /scripts/create-connectors.sh  # Note: This script is stored externally from container
      sleep infinity
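
The thread never shows /scripts/create-connectors.sh itself. A minimal sketch of what it might contain, assuming a single connector posted to the Connect REST API. The connector name, class, file path and topic are placeholders, and CONNECT_URL, connector_payload and create_connector are hypothetical names chosen for illustration:

```shell
# Hypothetical create-connectors.sh. The connector name, class and settings
# below are placeholders, not taken from the thread.
CONNECT_URL="${CONNECT_URL:-http://kafka-connect:8083}"

# Print the JSON body expected by POST /connectors: a name plus a config map.
connector_payload() {
  cat <<'EOF'
{
  "name": "example-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/input.txt",
    "topic": "example-topic"
  }
}
EOF
}

# Send the payload to the Connect REST API; --data @- reads the body from stdin.
create_connector() {
  connector_payload | curl -s -X POST -H 'Content-Type: application/json' \
    --data @- "$CONNECT_URL/connectors"
}
```

Calling `create_connector` at the end of the script would register the connector once the REST API is reachable.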

Feb 06 '19 17:02 OneCricketeer

Thanks - this was helpful! I prefer a slightly optimized version:

  volumes:
    - $PWD/scripts:/scripts  # TODO: Create this folder ahead of time, on your host
  command: 
    - bash 
    - -c 
    - |
      /etc/confluent/docker/run & 
      echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
      while : ; do
        curl_status=$$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors)
        echo -e $$(date) " Kafka Connect listener HTTP state: " $$curl_status " (waiting for 200)"
        if [ $$curl_status -eq 200 ] ; then
          break
        fi
        sleep 5 
      done
      echo -e "\n--\n+> Creating Kafka Connector(s)"
      /scripts/create-connectors.sh  # Note: This script is stored externally from container
      sleep infinity

Changes over the above version:

Only run curl once per iteration - no need to call it a second time just for printing the status. Also only one place to change the hostname and port.
Check for status 200 instead of a status different than 000. I've had Connect return a 404 status during startup (since the HTTP port was up, but the endpoint was not deployed yet), in which case the create-connectors.sh script failed. Waiting for 200 ensures that the connectors endpoint is available.
Removed the netcat call - not sure what that was needed for. Looks like a leftover from a previous version of the script...

FWIW, I run this from a separate service in my Docker Compose file - the image I use is appropriate/curl:latest. That way, you don't have to start the run command in the background...
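
For a separate service like that, the wait loop could be wrapped in a function that gives up after a while instead of polling forever. This is only a sketch: CONNECT_URL, MAX_ATTEMPTS, SLEEP_SECONDS and wait_for_connect are assumed names, not something from the thread or the image.

```shell
# Hypothetical wait helper for a separate "init" service.
# CONNECT_URL, MAX_ATTEMPTS and SLEEP_SECONDS are assumed names.
CONNECT_URL="${CONNECT_URL:-http://kafka-connect:8083}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-60}"
SLEEP_SECONDS="${SLEEP_SECONDS:-5}"

# Poll the connectors endpoint until it answers 200 or the attempts run out.
wait_for_connect() {
  attempt=1
  while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
    # curl prints 000 and exits non-zero while the port is closed; "|| true"
    # keeps the loop alive even if the caller uses "set -e".
    status="$(curl -s -o /dev/null -w '%{http_code}' "$CONNECT_URL/connectors" || true)"
    if [ "$status" = "200" ]; then
      echo "Kafka Connect is ready"
      return 0
    fi
    echo "$(date) Kafka Connect listener HTTP state: $status (waiting for 200)"
    sleep "$SLEEP_SECONDS"
    attempt=$((attempt + 1))
  done
  echo "Gave up after $MAX_ATTEMPTS attempts" >&2
  return 1
}
```

The init service could then run something like `wait_for_connect && /scripts/create-connectors.sh`.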

Aug 27 '19 14:08 nwinkler

@nwinkler nice tips, thanks for sharing!

Sep 02 '19 11:09 rmoff

💡 If you want to run this command from a .sh file, use:

echo "Waiting for Kafka Connect to start listening on kafka-connect ⏳"
while true
do
    # Poll the REST API; curl reports 000 while the port is not reachable yet
    curl_status="$(curl -s -o /dev/null -w '%{http_code}' 'http://kafka-connect:8083/connectors')"
    if [ "$curl_status" -eq 200 ]
    then
        break
    fi
    echo "$(date) Kafka Connect listener HTTP state: $curl_status (waiting for 200)"
    sleep 5
done
/scripts/create-connectors.sh  # Note: This script is stored externally from container
sleep infinity

Oct 19 '21 08:10 Matesanz

Thanks, @Matesanz

That looks the same as what I already posted: https://github.com/confluentinc/cp-docker-images/issues/467#issuecomment-461104319

Oct 19 '21 09:10 OneCricketeer

> Thanks, @Matesanz
>
> That looks like the same I already posted #467 (comment)

Yes, I know, but it's not the same: trying to run that code from a .sh file results in an error:

line 21: syntax error near unexpected token `('
line 21: `    curl_status =$$(curl -s -o /dev/null -w %{http_code} http://kafka-connect:8083/connectors)
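
For context on that error: Docker Compose interpolates `$` in the YAML file, and `$$` is its escape for a literal dollar sign. In a standalone script the shell sees `$$` itself and expands it to its own process ID, so `$$(curl ...)` no longer parses as a command substitution. A small illustration (the variable names are made up for the example):

```shell
# In a plain shell script, $$ is the shell's process ID, not an escaped dollar sign.
pid_text="$$"
echo "Two dollar signs expand to the PID: $pid_text"

# Command substitution in a script therefore uses a single dollar sign.
greeting="$(echo hello)"
echo "$greeting"
```

So when moving the loop out of the Compose file, every `$$` has to become a single `$`.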

Oct 19 '21 09:10 Matesanz

You still need to modify the container command to run different files, though.

This issue was opened so that wouldn't be necessary.

Oct 19 '21 10:10 OneCricketeer

It's still useful for those looking for how to auto-load a connector.

Also, official tutorials on this topic use an external file (https://github.com/mitch-seymour/mastering-kafka-streams-and-ksqldb/blob/master/chapter-09/files/ksqldb-server/run.sh), and their readers could end up here (as I did).

Oct 19 '21 10:10 Matesanz

I also found your response on this topic here really useful:

> Other solution would be start connect-distributed once, anywhere, configure the internal topics, post a connector (which saves to config topic), then start N containers, and they all pick up the same config

It's not easy to figure out how to properly configure a connector using the connect-standalone and connect-distributed scripts.

Oct 19 '21 10:10 Matesanz

It's not possible to use standalone because the connectors wouldn't be persistent. The Connect container doesn't need to change; it already runs the connect-distributed script.

My suggestion was to use at least two different containers: one that starts the Connect server alone with all needed plugins, then another that iterates over a list of mounted JSON files and posts them all, using the file name as the connector name, for example.

If that init container happens to restart, the connectors with those names already exist, so no harm is done.
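
A rough sketch of such an init script, under a few assumptions that are not from this thread: the files are mounted at /connectors, each file contains only the connector's flat config map (no "name"/"config" wrapper), and CONNECT_URL, CONNECTOR_DIR, connector_name and load_connectors are hypothetical names. It uses PUT /connectors/<name>/config, which creates the connector or updates it if it already exists, so a restart just re-applies the same configs.

```shell
# Sketch of an init-container script: register every JSON file in a directory.
# CONNECT_URL and CONNECTOR_DIR are assumed names and defaults.
CONNECT_URL="${CONNECT_URL:-http://kafka-connect:8083}"
CONNECTOR_DIR="${CONNECTOR_DIR:-/connectors}"

# Derive the connector name from the file name,
# e.g. /connectors/jdbc-source.json -> jdbc-source
connector_name() {
  basename "$1" .json
}

# PUT each file's config map to /connectors/<name>/config.
load_connectors() {
  for file in "$CONNECTOR_DIR"/*.json; do
    [ -e "$file" ] || continue   # empty directory: the glob did not match
    name="$(connector_name "$file")"
    echo "Registering connector: $name"
    curl -s -f -X PUT -H 'Content-Type: application/json' \
      --data "@$file" "$CONNECT_URL/connectors/$name/config" > /dev/null \
      || echo "Failed to register $name" >&2
  done
}
```

`load_connectors` would be called once the REST API answers 200, for example after a wait loop like the ones earlier in this thread.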

Oct 19 '21 10:10 OneCricketeer

> official tutorials on this topic use an external file

I don't see where that script is used in the compose file

Oct 19 '21 10:10 OneCricketeer

> > official tutorials on this topic use an external file
>
> I don't see where that script is used in the compose file

here: https://github.com/mitch-seymour/mastering-kafka-streams-and-ksqldb/blob/924bc71b394baf3284c21dedc498b8f5e98898b9/chapter-09/docker-compose.yml#L37

Oct 19 '21 10:10 Matesanz

Ah. My bad, was looking for a Connect container rather than ksqlDB

Oct 19 '21 10:10 OneCricketeer

> It's not possible to use standalone because the connectors wouldn't be persistent. The Connect container doesn't need to change; it already runs connect distributed script

Sorry for my ignorance but why wouldn't the connectors persist in Standalone?

Oct 19 '21 10:10 Matesanz

The Connect containers are ephemeral. Connect in standalone mode isn't configured with the three internal topics to store configs, statuses or (source) offsets.

https://docs.confluent.io/platform/current/connect/concepts.html
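
To make the three internal topics concrete: in distributed mode the Connect image takes them as CONNECT_* environment variables, which its startup scripts turn into the worker properties config.storage.topic, offset.storage.topic and status.storage.topic. The group id and topic names below are example values only (assumptions, pick your own):

```shell
# Distributed-mode worker settings for the Connect image, as environment variables.
# The group id and topic names are example values, not required names.
export CONNECT_GROUP_ID="connect-cluster"
export CONNECT_CONFIG_STORAGE_TOPIC="connect-configs"   # connector and task configs
export CONNECT_OFFSET_STORAGE_TOPIC="connect-offsets"   # source connector offsets
export CONNECT_STATUS_STORAGE_TOPIC="connect-status"    # connector and task statuses
```

Because this state lives in Kafka rather than in the container, a replacement container with the same group id and topics picks up the existing connectors.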

Oct 19 '21 13:10 OneCricketeer
