confluent-cli
Confluent CLI doesn't use the standard Confluent config, and gives no warning
If you install CP from DEB/RPM, you have /etc/kafka/server.properties which says that the data dir is in /var/lib/kafka. So if you do “confluent start” to start kafka, you may expect your data to be there. It isn’t. CLI has a magic properties file in /tmp/confluent.WGTZu15Y/kafka/kafka.properties which uses /tmp/confluent.WGTZu15Y/kafka/data for the data.
This was a pretty big surprise to me, so we should be more explicit when we do that.
+1, I hit a similar problem, with connect-distributed.properties not being read from where I expected it to be.
There's a different cause for each of your observations.
Re: 1) One of the few overrides that the CLI applies without requiring user input is the location of the data directories: for instance, dataDir for ZooKeeper and log.dirs for the Kafka broker.
This was done to enable containerization of each deployment and to keep all the data of a Confluent CLI run in one place. It allows a user to have multiple, even concurrent, runs if ports are set appropriately. Most importantly, it makes the CLI more robust, because an accidental change in the config won't result in confluent destroy deleting data that it is not supposed to delete. A Confluent CLI run is supposed to control everything under the ${CONFLUENT_CURRENT}/confluent.current directory (CONFLUENT_CURRENT defaults to $TMPDIR if unset).
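To see which settings a run has overridden, one option is to compare the packaged properties with the file the CLI generated. The following is only a minimal sketch: parse_properties and overridden_keys are hypothetical helpers, not part of the CLI, and the parser handles only plain key=value lines (no ":" separators or line continuations). The sample values are the two paths quoted earlier in this thread.

```python
def parse_properties(text):
    """Parse simple key=value lines; blank lines and # or ! comments are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def overridden_keys(packaged, effective):
    """Return {key: (packaged_value, effective_value)} for keys whose values differ."""
    return {
        key: (packaged.get(key), value)
        for key, value in effective.items()
        if packaged.get(key) != value
    }


# Inline samples stand in for /etc/kafka/server.properties and the generated file.
packaged = parse_properties("broker.id=0\nlog.dirs=/var/lib/kafka\n")
effective = parse_properties(
    "broker.id=0\nlog.dirs=/tmp/confluent.WGTZu15Y/kafka/data\n"
)
for key, (old, new) in overridden_keys(packaged, effective).items():
    print(f"override: {key}: {old} -> {new}")
```

In practice the two strings would be read from the real files; the comparison itself stays the same.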
Re: 2) It made sense for the worker properties to have AvroConverter as the default converter for keys and values. To do that with minimal changes, we currently load the worker properties from ./etc/schema-registry/connect-avro-distributed.properties, which I admit is not the first place somebody looks for Connect properties. A properties file that integrates with schema-registry can't go into ./etc/kafka by merging it into AK. But we can think of ways to put a symlink in a more intuitive place in the Confluent Platform packages.
I have to admit I see that as problematic as well. I understand the reasoning but am not happy with it.
- If the CLI does override something, it should say so.
- If you specify a log.dir(s) value in /etc/kafka/server.properties and it does not get used, you are ... surprised.
- It is not clear to me at the moment which server.properties is going to be used. My current assumption is that /etc/kafka/server.properties is copied to the CONFLUENT_CURRENT subdirectory and all changes have to be made there.
- In production, and even in Docker, you want a stable service, meaning you operate a single cluster rather than having 10 people constantly creating and destroying clusters. There is no mention of how this is envisioned.
- A solution might be to provide a systemctl script. This is missing anyhow and could include everything that is done when running a stable, unchanging system.
- confluent start should have an option to enable/disable components. I never want to use the REST proxy, but I do want all the others. So I either have to start each one individually or the REST proxy gets started as well.
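On the question of which properties file a run actually uses, one way to check rather than assume is to list the generated files under the run directory and inspect them. This is a rough sketch only: find_properties is a hypothetical helper, and the demo builds a throwaway directory shaped like the kafka/kafka.properties layout reported at the top of this thread instead of touching a real run.

```python
import tempfile
from pathlib import Path


def find_properties(run_dir):
    """Return every *.properties file under run_dir as sorted relative paths."""
    root = Path(run_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.properties"))


# Demo against a temporary directory; a real check would pass the run directory.
with tempfile.TemporaryDirectory() as tmp:
    kafka_dir = Path(tmp) / "kafka"
    kafka_dir.mkdir()
    (kafka_dir / "kafka.properties").write_text(f"log.dirs={kafka_dir / 'data'}\n")
    found = find_properties(tmp)
    print(found)
```

Pointing find_properties at the directory of an actual run would show which files the CLI generated, and therefore which ones need editing.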
We just hit the same issue when installing CP from RPM. What is the current word on running a multi-node deployment (e.g. ZooKeeper) using the platform CLI? Or is it unsupported for that (very common) case?
I agree:
- If the CLI does override something, it should say so.
- What is the current word on running multi-node? See https://github.com/confluentinc/confluent-cli/issues/46 ... What about the ZooKeeper myid file?
Placing data in /tmp is a very smart idea, guys. Especially for RHEL-based distros... with tmpwatch turned on by default :) Keep up the good work!
@v-g-ustinov So long as you understand that Confluent CLI is currently just for development and not at all recommended for production, this does make sense.
@hjespers Unfortunately, the Quickstart guides fail to mention this, so it is easy to miss.
A nice big warning, 'Not suitable for production use', would solve this.
Totally agree @owenrh