pulsar-helm-chart
Pulsar Manager Persistence in HerdDB
Is your feature request related to a problem? Please describe.
Currently, Pulsar Manager's data is not persisted outside of the container it is running in. This means it has to be reconfigured whenever the Helm chart is reinstalled. This is not a favorable solution.
Describe the solution you'd like
We can either provide a custom JDBC connection string pointing to an external storage medium, or use ZooKeeper to store this data.
I think storing it in ZooKeeper should be the preferred solution.
The default value in values.yaml for the key pulsar_manager.configData.URL (as well as others that need tuning) should be set according to: https://github.com/apache/pulsar-manager#default-test-database-herddb
Describe alternatives you've considered
External DB. However, keeping the data in ZooKeeper makes ZooKeeper the central point of data and should therefore be favored.
I renamed Zookeeper -> HerdDB in the title. @Mortom123 I guess that's what you meant by persistence in ZooKeeper? HerdDB uses both ZooKeeper and BookKeeper.
#343 has been merged. It provides persistence using a PVC and a Postgres DB. I guess that might be sufficient for many use cases.
If it is possible to configure the JDBC URL, then storing data on BookKeeper using HerdDB is easy. HerdDB uses ZooKeeper the very same way as Pulsar BookKeeper.
@eolivelli - I also thought that. Optimally, I would like to persist all of the data needed to run the pulsar manager in a single point in the cluster. I stumbled upon this setting here:
# HerdDB - start embedded server 'diskless-cluster' mode, WAL and Data persisted on Bookies, Metadata on ZooKeeper in '/herd', listening on localhost:7000
#spring.datasource.url=jdbc:herddb:zookeeper:localhost:2181?server.start=true&server.base.dir=dbdata&server.mode=diskless-cluster&server.node.id=localhost
And thought that the Pulsar Manager environment variable SPRING_DATASOURCE_URL could be set to something like: jdbc:herddb:zookeeper:zookeeper.service:2181?server.start=true&server.base.dir=dbdata&server.mode=diskless-cluster&server.node.id=(randAlpha10) to enable this.
This would mean the Pulsar Manager container itself does not need extra volume(-mounts) and data is at a central point in the cluster.
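As a rough sketch of the idea above, a values override could look something like the following. It is untested and rests on assumptions: that the chart passes pulsar_manager.configData.URL through to the Pulsar Manager datasource URL, that ZooKeeper is reachable at zookeeper.service:2181 as in the example URL, and that other keys (for instance the JDBC driver class) may also need tuning. The node id value is a hypothetical placeholder standing in for the random id.

```yaml
# Hypothetical values override; not verified against the chart.
pulsar_manager:
  configData:
    # HerdDB embedded server in 'diskless-cluster' mode:
    # WAL and data on the bookies, metadata in ZooKeeper.
    # "pulsar-manager-node" is a placeholder node id (assumption).
    URL: "jdbc:herddb:zookeeper:zookeeper.service:2181?server.start=true&server.base.dir=dbdata&server.mode=diskless-cluster&server.node.id=pulsar-manager-node"
```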
Does this work?