cassandra icon indicating copy to clipboard operation
cassandra copied to clipboard

WIP: Initial implementation of cassandra-conf with nodetool example - CASSANDRA-17773

Open Claudenw opened this issue 3 years ago • 10 comments

This is a work in progress I am looking for comments

This change does 4 things:

  1. Moves the standard boiler plate from the standard scripts into a single maintainable script to be sourced in the original scripts.
  2. Creates a debug logging so that problem determination (as it CASSANDRA-17773) is easier.
  3. Has code to preserve environment variables when needed (c.f. nodetool script).
  4. Provides a verification method that will verify that all standard environment variables are set.

In practice this is a simple 2 line replacement for most boiler plate blocks.

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by Claude Warren; reviewed by <Reviewers> for CASSANDRA-17773

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

Claudenw avatar Oct 26 '22 13:10 Claudenw

@mmuzaf You may be interested in this change.

Claudenw avatar Nov 08 '22 13:11 Claudenw

Some questions…

  1. isn't this a wrapper to either conf/cassandra-env.sh or bin/cassandra.in.sh ? the name cassandra-conf.sh is confusing…

  2. scripts should be silent when successful, by default

  3. you can't presume the file is in the same directory, or where anything is being executed from,

wrt (1), couldn't we instead pull a cassandra-env.conf out of cassandra-env.sh so to split the logic and the user-editable variables into separate files …?

michaelsembwever avatar Feb 25 '23 06:02 michaelsembwever

I fully agree with the third point, @michaelsembwever . That is what I was trying to communicate as well. I think that Claude reached the same conclusion about 1) and I dont have any objections with 2), definitely it should be silent when executed just fine to follow basic *nix principles.

Not sure about your last question. Maybe Claude can try to model that, just for the sake of seeing how that would look like.

smiklosovic avatar Feb 25 '23 19:02 smiklosovic

@michaelsembwever

  1. I agree the name is confusing, does anyone have a better name? The goal of this change is to simply remove the boilerplate from the other scripts while providing debugging capabilities to more easily detect the issue that caused CASSANDRA-17773. I did not make any change to operation of the scripts. Pulling a cassandra-env.conf out of cassandra-env.sh may be a reasonable thing to do but is outside the scope of this change.
  2. Script is silent by default. I think you are referring to the -x argument. That was included in error and was a debugging switch.
  3. This script has to run before before we can assume any environment variables are set as it sets many of them through the logic that pulls in the cassandra.in.sh script. I make 2 assumptions for Cassandra provided scripts.
    • The scripts in 'bin/ are not to be edited by end users
    • The scripts provided by Cassandra are all deployed in bin/. Thus the cassandra-conf.sh will be in the same directory as nodetool. However, I realise that there was an error in the way the file was included. The code in nodetool now uses realpath and dirname to figure out where cassandra-conf.sh is. This means that symbolic links to nodetool will work correctly. (other cassandra provided scripts in bin/ will be configured the same way)

For user created scripts I assume they know where bin/ is and will modify their scripts to source cassandra-conf.sh from there.

Claudenw avatar Feb 27 '23 09:02 Claudenw

I yet again want to reiterate the fact that environment property CASSANDRA_HOME might be set in user's environment and it is in fact set way before any script is executed so the third point is not entirely true.

smiklosovic avatar Feb 27 '23 11:02 smiklosovic

If CASSANDRA_HOME were set in every case then there would be no need for cassandra.in.sh to set it, and yet it does.
cassandra-conf.sh can not depend upon CASSANDRA_HOME being set.

In fact looking at an unedited sstableutil code you can see that CASSANDRA_HOME is not used and may be set by cassandra.in.sh

https://github.com/apache/cassandra/blob/98a055e51c8eb6b788d3d0fe94f3d2576ab14d39/bin/sstableutil#L19-L33

Claudenw avatar Feb 27 '23 12:02 Claudenw

@smiklosovic , @michaelsembwever any further comments on this?

Claudenw avatar Apr 03 '23 13:04 Claudenw

@smiklosovic , @michaelsembwever any further comments on this or should I go ahead an perform the changes on all the scripts?

Claudenw avatar Jun 27 '23 12:06 Claudenw

where did we get with this? @smiklosovic seems to say you agree, while you say it is out of scope ?

If CASSANDRA_HOME is set it must be honoured.

One example I know of, it needs to be possible to execute scripts in a random place against a CASSANDRA_HOME somewhere else. (tests do this)

-- otherwise… my concerns with this work is that we it requires a lot of eyes and testing, and that we won't get that before the August freeze.

michaelsembwever avatar Jun 29 '23 12:06 michaelsembwever

@Claudenw I would rather focus on this PR bellow, we need to be smart about time in the context of incoming freeze. That one is a new feature, this is "just" an improvement. AFAIK new features in a freeze window are forbidden.

https://github.com/apache/cassandra/pull/2282

smiklosovic avatar Jun 29 '23 16:06 smiklosovic