cp-all-in-one

ksql-datagen exits. Can't connect to broker

jtravan3 opened this issue • 17 comments

Description

I followed the steps listed here (https://docs.confluent.io/platform/current/quickstart/ce-docker-quickstart.html#) to start up a local Kafka instance on Docker. ksql-datagen exits after a couple of minutes because it cannot connect to the broker.

Troubleshooting

I think this may be related to issues #53, #51, #25, #24, #12, and #10. I'm seeing the same behavior.

1. I run docker-compose up -d and can see all of the containers up and running.
2. http://localhost:9021 shows no response.
3. I then run docker-compose ps and see that ksql-datagen has exited.
4. Running docker logs ksql-datagen shows that it can't connect to the broker.

Connection to node -1 (broker/192.168.224.3:29092) could not be established. Broker may not be available.

I have made no changes to the docker-compose.yml.
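As a quick sanity check independent of the Kafka clients, you can test whether the broker's host-facing port accepts TCP connections at all. This is a sketch; localhost:9092 is the PLAINTEXT_HOST listener advertised in the unmodified docker-compose.yml:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# The broker's host-facing listener from the compose file
print(port_open("localhost", 9092))
```

If this prints False while docker-compose ps still shows the broker as up, the container is likely crash-looping before the listener ever binds.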

Environment

  • GitHub branch: 6.2.0-post
  • Operating System: MacOS Big Sur 11.5
  • Version of Docker: 3.5.2
  • Version of Docker Compose: 1.29.2

jtravan3 · Jul 24 '21 14:07

I'm not sure 192.168.x.x addresses would be correct within the Docker bridge, but did you allocate the 6GB of memory to Docker as that page suggests?

OneCricketeer · Aug 04 '21 18:08

@OneCricketeer I ended up backing down to version 5.5.5 (I'm not sure if that has anything to do with it) and allocating 8 GB of memory to Docker instead of 6. After that, I can run it successfully. I have not tried the same with 6.2.0. Occasionally, ksql-datagen exits unexpectedly and I have to restart it, but I have been able to stumble forward.

jtravan3 · Aug 04 '21 19:08

I don't think I've personally tried starting all components recently, but from what I've seen it's usually a memory issue. If you don't need the REST Proxy or Control Center while you use ksqlDB, for example, you can comment them out in the YAML.

OneCricketeer · Aug 04 '21 19:08

Clean install of Docker. Cloned https://github.com/confluentinc/cp-all-in-one/blob/6.2.0-post/cp-all-in-one/docker-compose.yml, ran docker-compose up, and the containers exit.

Memory allocated: 8 GB

This goes beyond memory allocation; there is an issue with this Docker Compose config.

agates4 · Sep 07 '21 22:09

I just started that file (after removing the REST Proxy and Connect services and references to them), and I get a page on http://localhost:9021 saying there's a healthy cluster; no containers have exited after 5 minutes.

Mac: 11.5.2, Docker: 20.10.8 (darwin/amd64)

Update: adding back Connect and REST Proxy still works; no exited containers.

OneCricketeer · Sep 07 '21 23:09

Thank you for the comment - taking a second look and will report back.

agates4 · Sep 07 '21 23:09

The broker fails, triggering failure of the other images.

Mac: 11.4 (and updated to 11.5.2; it still doesn't work). Docker: 20.10.8, M1 ARM version.

Steps:

  1. docker system prune
  2. Clone https://raw.githubusercontent.com/confluentinc/cp-all-in-one/6.2.0-post/cp-all-in-one/docker-compose.yml
  3. docker-compose up
  4. Wait
  5. See the following error in the broker container:
07 23:47:42,271] TRACE [Controller id=1 epoch=1] Received response UpdateMetadataResponseData(errorCode=0) for request UPDATE_METADATA with correlation id 10 sent to broker broker:29092 (id: 1 rack: null tags: []) (state.change.logger)

[2021-09-07 23:47:45,721] INFO Started o.e.j.s.ServletContextHandler@796a728c{/v1/metadata,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler)

[2021-09-07 23:47:46,526] INFO Waiting for 8 seconds for metric reporter topic _confluent-telemetry-metrics to become available. (io.confluent.cruisecontrol.metricsreporter.ConfluentMetricsSamplerBase)

[2021-09-07 23:47:47,240] INFO Stopped o.e.j.s.ServletContextHandler@796a728c{/v1/metadata,null,STOPPED} (org.eclipse.jetty.server.handler.ContextHandler)

[2021-09-07 23:47:47,241] INFO node0 Stopped scavenging (org.eclipse.jetty.server.session)

[2021-09-07 23:47:47,286] WARN KafkaHttpServer transitioned from STARTING to FAILED.: org.glassfish.jersey.internal.spi.AutoDiscoverable: : java.net.MalformedURLException: no !/ found in url spec:file:/usr/share/java/confluent-security/schema-validator/jersey-common-2.34.jar!/META-INF/services/org.glassfish.jersey.internal.spi.AutoDiscoverable. (io.confluent.http.server.KafkaHttpServerImpl)

org.glassfish.jersey.internal.ServiceConfigurationError: org.glassfish.jersey.internal.spi.AutoDiscoverable: : java.net.MalformedURLException: no !/ found in url spec:file:/usr/share/java/confluent-security/schema-validator/jersey-common-2.34.jar!/META-INF/services/org.glassfish.jersey.internal.spi.AutoDiscoverable

(Screenshot of the broker container logs)

agates4 · Sep 07 '21 23:09

I am getting the same issue with 6.1.0. I am now going to try 6.0.3.

agates4 · Sep 08 '21 00:09

(Screenshot: my Docker resources)

agates4 · Sep 08 '21 00:09

I don't know what HTTP servlet Kafka (rather, "Confluent Server") is trying to run, but I don't think that's related to the actual container crashing.

You should try the community one instead, which runs a less customized Kafka broker: https://github.com/confluentinc/cp-all-in-one/blob/6.2.0-post/cp-all-in-one-community/docker-compose.yml

OneCricketeer · Sep 08 '21 04:09

Ah, thank you - this seems to have gotten the Kafka broker working, but the schema registry is failing.

[2021-09-08 05:00:24,885] INFO [Schema registry clientId=sr-1, groupId=schema-registry] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2021-09-08 05:00:24,905] INFO [Schema registry clientId=sr-1, groupId=schema-registry] Successfully joined group with generation Generation{generationId=3, memberId='sr-1-273e5715-2660-4128-a38b-79034cf4eecc', protocol='v0'} (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2021-09-08 05:00:24,980] INFO [Schema registry clientId=sr-1, groupId=schema-registry] Successfully synced group in generation Generation{generationId=3, memberId='sr-1-273e5715-2660-4128-a38b-79034cf4eecc', protocol='v0'} (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2021-09-08 05:00:24,994] INFO Finished rebalance with leader election result: Assignment{version=1, error=0, leader='sr-1-273e5715-2660-4128-a38b-79034cf4eecc', leaderIdentity=version=1,host=schema-registry,port=8081,scheme=http,leaderEligibility=true} (io.confluent.kafka.schemaregistry.leaderelector.kafka.KafkaGroupLeaderElector)
[2021-09-08 05:00:25,059] INFO Wait to catch up until the offset at 3 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2021-09-08 05:00:25,068] INFO Reached offset at 3 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.eclipse.jetty.http.MimeTypes.<clinit>(MimeTypes.java:175)
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.<init>(GzipHandler.java:190)
	at io.confluent.rest.ApplicationServer.wrapWithGzipHandler(ApplicationServer.java:468)
	at io.confluent.rest.ApplicationServer.wrapWithGzipHandler(ApplicationServer.java:477)
	at io.confluent.rest.ApplicationServer.finalizeHandlerCollection(ApplicationServer.java:213)
	at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:230)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
	at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: java.nio.charset.IllegalCharsetNameException: l;charset=iso-8859-1
	at java.base/java.nio.charset.Charset.checkName(Charset.java:308)
	at java.base/java.nio.charset.Charset.lookup2(Charset.java:482)
	at java.base/java.nio.charset.Charset.lookup(Charset.java:462)
	at java.base/java.nio.charset.Charset.forName(Charset.java:526)
	at org.eclipse.jetty.http.MimeTypes$Type.<init>(MimeTypes.java:107)
	at org.eclipse.jetty.http.MimeTypes$Type.<clinit>(MimeTypes.java:67)
	... 8 more

This seems related to the charset errors I was seeing with the broker in my previous tests. Maybe something is going on with the 6.x.x releases.

agates4 · Sep 08 '21 05:09

https://github.com/localstack/localstack/issues/4456#issuecomment-905904064

I see an issue with the Alpine Linux JDK on M1 involving an illegal charset; a similar problem could be afflicting the Confluent images.

agates4 · Sep 08 '21 05:09

I think you said you were running darwin/amd64, not the M1 ARM Docker version... In any case, I believe you can specify a platform to emulate the amd64 runtime.

Last I checked, Confluent Platform is not supported on ARM-based systems.
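For what it's worth, forcing emulation is a one-line addition per service in Compose. A sketch, untested; the `platform` key is the same one used further down this thread for arm64:

```yaml
services:
  broker:
    image: confluentinc/cp-server:6.2.0
    platform: linux/amd64   # emulate x86 under Docker Desktop on Apple Silicon
```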

OneCricketeer · Sep 08 '21 05:09

Sorry about that - yeah, I am running on M1. I will try specifying a platform.

agates4 · Sep 08 '21 05:09

OK - eugenetea/schema-registry-arm64:latest is a Schema Registry image that appears to work well, from this issue thread: https://github.com/confluentinc/cp-docker-images/issues/718

I see an issue thread open about ARM support here: https://github.com/confluentinc/common-docker/issues/117. It looks like all of my issues were ARM-related. I'd be curious to see who else in this thread is on an ARM system.

agates4 · Sep 08 '21 05:09

To those looking for a working M1 ARM broker + zookeeper + schema-registry setup, here is my docker-compose.yml:

---
version: '3.4'

services:
  broker:
    container_name: broker
    depends_on:
      - zookeeper
    environment:
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_BROKER_ID: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
      KAFKA_JMX_HOSTNAME: localhost
      KAFKA_JMX_PORT: 9101
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    hostname: broker
    image: confluentinc/cp-kafka:6.2.0
    ports:
      - '29092:29092'
      - '9092:9092'
      - '9101:9101'
  schema-registry:
    container_name: schema-registry
    depends_on:
      - broker
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: broker:29092
      SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
    hostname: schema-registry
    image: eugenetea/schema-registry-arm64:latest
    platform: linux/arm64/v8
    ports:
      - '8081:8081'
  zookeeper:
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    hostname: zookeeper
    image: confluentinc/cp-zookeeper:6.2.0
    ports:
      - '2181:2181'
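
One thing worth understanding in the file above is the dual-listener setup: KAFKA_ADVERTISED_LISTENERS advertises broker:29092 to other containers on the Compose network and localhost:9092 to clients on the host, which is why the datagen error earlier in this thread mentions broker:29092. A small sketch of how such a listener string decomposes (the parser is illustrative, not a Kafka API):

```python
def parse_listeners(value: str) -> dict:
    """Split 'NAME://host:port,...' into {NAME: (host, port)}."""
    out = {}
    for entry in value.split(","):
        name, address = entry.split("://", 1)
        host, port = address.rsplit(":", 1)
        out[name] = (host, int(port))
    return out

advertised = parse_listeners(
    "PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092")
print(advertised["PLAINTEXT_HOST"])  # ('localhost', 9092) -- what host clients use
```

Clients that connect to the wrong side of this mapping (e.g. a host tool pointed at broker:29092) will see "Broker may not be available" errors like the one reported above.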

agates4 · Sep 08 '21 05:09

This is what worked for me on my M1 chip:

  • zookeeper image: confluentinc/cp-zookeeper:latest.arm64
  • broker image: confluentinc/cp-server:latest.arm64
  • schema-registry image: confluentinc/cp-schema-registry:latest.arm64
  • connect image: cnfldemos/cp-server-connect-datagen:0.5.0-6.2.0 [I haven't tested with the latest cnfldemos/cp-server-connect-datagen:0.5.3-7.1.0]
  • control-center image: confluentinc/cp-enterprise-control-center:latest.arm64
  • ksqldb-server image: confluentinc/cp-ksqldb-server:latest.arm64
  • ksqldb-cli image: confluentinc/cp-ksqldb-cli:latest.arm64
  • ksql-datagen image: confluentinc/ksqldb-examples:latest.arm64
  • rest-proxy image: confluentinc/cp-kafka-rest:latest.arm64

https://github.com/confluentinc/cp-all-in-one/blob/7.1.1-post/cp-all-in-one/docker-compose.yml
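In Compose terms, the substitutions above are just image swaps against that file. A partial sketch: only the changed image lines are shown, with all other settings as in the linked 7.1.1-post file:

```yaml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest.arm64
  broker:
    image: confluentinc/cp-server:latest.arm64
  schema-registry:
    image: confluentinc/cp-schema-registry:latest.arm64
  connect:
    image: cnfldemos/cp-server-connect-datagen:0.5.0-6.2.0
```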

nicolepastrana · Jul 17 '22 20:07