
[BUG] 307 error on `public/__kafka_producerid/__transaction_producerid_generator` creation when pulsar standalone is restarted as a systemd service

Open MMirelli opened this issue 3 years ago • 0 comments

Describe the bug The broker keeps answering with HTTP 307 (Temporary Redirect) to

PUT /admin/v2/persistent/public/__kafka_producerid/__transaction_producerid_generator?authoritative=true HTTP/1.1

when pulsar standalone is restarted as a systemd service.

This seems to be caused by a race condition: the PUT above is probably issued before the bundle containing that topic has been assigned to the only standalone broker.
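The suspected sequence can be sketched as a small simulation. Everything in it is hypothetical and only stands in for the real components: `FakeBroker` is not Pulsar or KoP code, `ready_after` models how many requests arrive before the bundle is owned, and `MAX_REDIRECTS = 5` is taken from the "Maximum redirect reached: 5" message in the log further down. It shows how a PUT that arrives too early could exhaust its redirect budget, and how repeating the whole operation would only help if ownership is eventually established, which this report does not confirm.

```python
# Illustrative simulation only; none of these names come from Pulsar or KoP.
MAX_REDIRECTS = 5  # mirrors "Maximum redirect reached: 5" in the reported log


class FakeBroker:
    """Answers 307 until the (simulated) bundle ownership is established."""

    def __init__(self, ready_after):
        self.ready_after = ready_after  # requests seen before ownership exists
        self.requests = 0

    def put_topic(self):
        self.requests += 1
        # Before ownership: redirect (here, back to the same broker).
        return 307 if self.requests <= self.ready_after else 204


def put_following_redirects(broker):
    """Issue the PUT once, then follow up to MAX_REDIRECTS redirects."""
    for _ in range(MAX_REDIRECTS + 1):
        status = broker.put_topic()
        if status != 307:
            return status
    raise RuntimeError(f"Maximum redirect reached: {MAX_REDIRECTS}")


def put_with_retry(broker, attempts=3):
    """Hypothetical mitigation: repeat the whole operation a few times."""
    for attempt in range(1, attempts + 1):
        try:
            return put_following_redirects(broker)
        except RuntimeError:
            if attempt == attempts:
                raise
            # A real caller would sleep or back off here before retrying.
```

With `FakeBroker(ready_after=10)`, a single `put_following_redirects` call fails after six requests, while `put_with_retry` succeeds on its second pass because the simulated ownership has appeared by then.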

To Reproduce Steps to reproduce the behavior:

  1. vagrant init generic/rhel7 && vagrant up && vagrant ssh
  2. copy / paste the service to /etc/systemd/system/pulsar-standalone1.service:
[Unit]
Description=Pulsar standalone debug race condition

[Service]
WorkingDirectory=/home/vagrant/apache-pulsar-2.10.1
Type=simple
Environment=JAVA_HOME=/usr/local/jdk-11.0.1 PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/jdk-11.0.1/bin 
ExecStart=/bin/bash -c "./start_standalone.sh"
KillMode=mixed

[Install]
WantedBy=multi-user.target
  3. Add to $HOME/apache-pulsar-2.10.1/conf/standalone.conf:
messagingProtocols=kafka
kafkaTransactionCoordinatorEnabled=true
kafkaEnableMultiTenantMetadata=true
kafkaNamespace=kafka
kafkaListeners=SASL_PLAINTEXT://127.0.0.1:9092
kafkaAdvertisedListeners=SASL_PLAINTEXT://127.0.0.1:9092
kafkaManageSystemNamespaces=true
  4. sudo systemctl start pulsar-standalone1.service
  5. pulsar_admin --admin-url "http://localhost:8080" tenants create tenant1
  6. pulsar_admin --admin-url "http://localhost:8080" namespaces create tenant1/kafka
  7. ./kafka_2.12-3.2.0/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic1 --from-beginning --consumer.config kafka_client_kafkacluster1.properties 9092 # in console-0
  8. ./kafka_2.12-3.2.0/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic topic1 --producer.config kafka_client_kafkacluster1.properties 9092 # in console-1
  9. The messages produced in console-1 will appear in console-0.
  10. sudo systemctl restart pulsar-standalone1.service
  11. The service doesn't start up and the following appears:
Jul 20 20:14:22 rhel7.localdomain bash[11095]: 2022-07-20T20:14:22,380+0000 [pulsar-web-46-5] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [20/Jul/2022:20:14:22 +0000] "PUT /admin/v2/persistent/public/__kafka_producerid/__transaction_producerid_generator HTTP/1.1" 307 0 "-" "Pulsar-Java-v2.10.1" 18
...
Jul 20 20:14:22 rhel7.localdomain bash[11095]: 2022-07-20T20:14:22,523+0000 [AsyncHttpClient-70-1] WARN  org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:8080/admin/v2/persistent/public/__kafka_producerid/__transaction_producerid_generator] Failed to perform http put request: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5
Jul 20 20:14:22 rhel7.localdomain bash[11095]: 2022-07-20T20:14:22,523+0000 [main] ERROR io.streamnative.pulsar.handlers.kop.utils.MetadataUtils - Failed to successfully initialize Kafka Metadata public/__kafka_producerid
Jul 20 20:14:22 rhel7.localdomain bash[11095]: org.apache.pulsar.client.admin.PulsarAdminException: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5
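To check how often the redirect occurs during a restart, the request-log entries can be tallied by status code. The following is a minimal sketch, not part of Pulsar or KoP: `count_statuses` and `REQUEST_RE` are names made up here, and the pattern assumes each log entry sits on a single line in the request-log shape shown above (method, path, protocol in quotes, followed by the status code).

```python
import re
from collections import Counter

# Matches the quoted request plus status seen in the report, for example:
#   "PUT /admin/v2/... HTTP/1.1" 307 0
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def count_statuses(lines, path_fragment):
    """Count HTTP status codes for requests whose path contains path_fragment."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and path_fragment in match.group("path"):
            counts[int(match.group("status"))] += 1
    return counts
```

It accepts any iterable of lines, so an open file holding saved journal output for the service can be passed in directly, with `"__transaction_producerid_generator"` as the path fragment.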

Expected behavior The standalone cluster should restart cleanly, without the error above being logged.

Additional context This was reproduced on a RHEL VM started with Vagrant. Dependencies:

  • apache-pulsar-2.10.1
  • kafka_2.12-3.2.0

MMirelli, Jul 20 '22 20:07