starlight-for-kafka
[BUG] 307 error on `public/__kafka_producerid/__transaction_producerid_generator` creation when pulsar standalone is restarted as a systemd service
Describe the bug
The broker keeps responding with HTTP 307 (Temporary Redirect) to
PUT /admin/v2/persistent/public/__kafka_producerid/__transaction_producerid_generator?authoritative=true HTTP/1.1
when pulsar standalone is restarted as a systemd service.
This seems to be caused by a race condition: probably the PUT above is issued even before the bundle containing that topic is assigned to the only standalone broker.
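For illustration only: if the cause is indeed a request issued before its precondition holds, the usual mitigation is to retry with a delay instead of giving up after a fixed number of redirects. Below is a minimal shell sketch of that pattern. `retry` is a hypothetical helper, not part of Pulsar or the Kafka protocol handler, and whether such a retry belongs in the handler's startup path is an assumption.

```shell
# Hypothetical helper (not part of Pulsar or KoP):
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempt budget is used up.
retry() {
  _retry_attempts="$1"
  _retry_delay="$2"
  shift 2
  _retry_i=1
  while [ "$_retry_i" -le "$_retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Wait between attempts, but not after the last one.
    if [ "$_retry_i" -lt "$_retry_attempts" ]; then
      sleep "$_retry_delay"
    fi
    _retry_i=$((_retry_i + 1))
  done
  return 1
}
```

For example, `retry 30 2 curl -sf http://localhost:8080/admin/v2/brokers/health` would poll the broker up to 30 times, 2 seconds apart, assuming the health endpoint is reachable at that address. This only helps an external caller; it does not change the handler's own startup behavior.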
To Reproduce
Steps to reproduce the behavior:
- vagrant init generic/rhel7 && vagrant up && vagrant ssh
- Copy/paste the service to /etc/systemd/system/pulsar-standalone1.service:
[Unit]
Description=Pulsar standalone debug race condition
[Service]
WorkingDirectory=/home/vagrant/apache-pulsar-2.10.1
Type=simple
Environment=JAVA_HOME=/usr/local/jdk-11.0.1 PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/jdk-11.0.1/bin
ExecStart=/bin/bash -c "./start_standalone.sh"
KillMode=mixed
[Install]
WantedBy=multi-user.target
- Add to $HOME/apache-pulsar-2.10.1/conf/standalone.conf:
messagingProtocols=kafka
kafkaTransactionCoordinatorEnabled=true
kafkaEnableMultiTenantMetadata=true
kafkaNamespace=kafka
kafkaListeners=SASL_PLAINTEXT://127.0.0.1:9092
kafkaAdvertisedListeners=SASL_PLAINTEXT://127.0.0.1:9092
kafkaManageSystemNamespaces=true
- sudo systemctl start pulsar-standalone1.service
- pulsar_admin --admin-url "http://localhost:8080" tenants create tenant1
- pulsar_admin --admin-url "http://localhost:8080" namespaces create tenant1/kafka
- ./kafka_2.12-3.2.0/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic1 --from-beginning --consumer.config kafka_client_kafkacluster1.properties 9092 # in console-0
- ./kafka_2.12-3.2.0/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic topic1 --producer.config kafka_client_kafkacluster1.properties 9092 # in console-1
- The messages produced in console-1 will appear in console-0.
- sudo systemctl restart pulsar-standalone1.service
- The service doesn't start up and the log below appears:
Jul 20 20:14:22 rhel7.localdomain bash[11095]: 2022-07-20T20:14:22,380+0000 [pulsar-web-46-5] INFO org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [20/Jul/2022:20:14:22 +0000] "PUT /admin/v2/persistent/public/__kafka_producerid/__transaction_producerid_generator HTTP/1.1" 307 0 "-" "Pulsar-Java-v2.10.1" 18
...
Jul 20 20:14:22 rhel7.localdomain bash[11095]: 2022-07-20T20:14:22,523+0000 [AsyncHttpClient-70-1] WARN org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:8080/admin/v2/persistent/public/__kafka_producerid/__transaction_producerid_generator] Failed to perform http put request: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5
Jul 20 20:14:22 rhel7.localdomain bash[11095]: 2022-07-20T20:14:22,523+0000 [main] ERROR io.streamnative.pulsar.handlers.kop.utils.MetadataUtils - Failed to successfully initialize Kafka Metadata public/__kafka_producerid
Jul 20 20:14:22 rhel7.localdomain bash[11095]: org.apache.pulsar.client.admin.PulsarAdminException: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5
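To gauge how often the redirect occurs during a restart, the request-log lines can be counted from the journal. This is a small sketch, not an official tool: `count_307` is a hypothetical helper that reads log lines on stdin and counts Jetty request-log entries recording a 307 response to the PUT on this topic, with or without a query string.

```shell
# Hypothetical helper: count request-log lines on stdin that record a 307
# response to the PUT on __transaction_producerid_generator.
# The [^"]* part allows an optional query string and the " HTTP/1.1" suffix.
count_307() {
  grep -c 'PUT /admin/v2/persistent/public/__kafka_producerid/__transaction_producerid_generator[^"]*" 307 '
}
```

A possible invocation is `journalctl -u pulsar-standalone1.service --no-pager | count_307`, where `--no-pager` keeps each log entry on a single line so the pattern can match.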
Expected behavior
The standalone cluster should restart cleanly, without the error above being displayed.
Additional context
This was reproduced on a RHEL VM started with Vagrant.
Dependencies:
- apache-pulsar-2.10.1
- kafka_2.12-3.2.0