HDDS-11432. Create ozone repair ldb put command for om.db
What changes were proposed in this pull request?
A repair command that adds a record to one of the tables in om.db. The command takes a key and a value as input and puts the record in the DB. If a record with the key already exists, it will be overwritten.
In this PR, support for the following tables is added: volumeTable, bucketTable, keyTable, openKeyTable, fileTable, openFileTable, directoryTable, deletedDirectoryTable, s3SecretTable.
The command introduced:
Usage: ozone repair ldb put [-hV] --cf=<tableName> [-d=<dnDBSchemaVersion>]
[--k=<key>] [--v=<value>]
CLI to put a record to a table.
--cf, --column_family, --column-family=<tableName>
Table name
-d, --dnSchema, --dn-schema=<dnDBSchemaVersion>
Datanode DB Schema Version: V1/V2/V3
-h, --help Show this help message and exit.
--k, --key=<key> Key of the record to be added to the DB
--v, --value=<value> Value of the record to be added to the DB
-V, --version Print version information and exit.
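For example, a record could be put into the bucketTable with an invocation along these lines (the key and value shown are purely illustrative; the value is assumed to be the JSON form of the table's value object, and the exact serialized format is discussed further down in this thread):

ozone repair ldb put --cf=bucketTable --k=/vol1/bucket1 --v='{"volumeName": "vol1", "bucketName": "bucket1", ...}'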
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11432
How was this patch tested?
Tested manually with a db. The output can be seen in this gist as it is too verbose to be added directly in the PR description/comment: https://gist.github.com/Tejaskriya/a39ecf26663e457864043e762d3f40d9
@Tejaskriya marking this as a draft for now.
- [ ] What is the serialized format for the value that needs to be provided to the CLI? Also, the CLI help should explain how someone can specify the value. Please include examples.
- [ ] Tests are missing, and this one definitely needs robot tests as well as unit and integration tests.
What is the use case for this command? Is it just a hypothetical "in case we need it in the future" type of thing, or have we seen legitimate bugs where this would have helped? IMO this command is somewhat complicated and dangerous, even for a repair subcommand, and should not be added without a very strong reason.
@errose28
There have been multiple issues in the past:
- HDDS-9876: replaying a transaction as recovery needed an update to the transactionInfoTable
- HDDS-7449: bucket encryption key overwrite
- key filename corruption in the DB due to concurrent cache update and persist (I do not remember the exact issue ID)
- many more issues where a dedicated repair could have been avoided by using a script
For the first two, we provided a specific repair method via ozone and implemented a new feature to repair, respectively.
This capability provides the flexibility to deliver a script as the recovery for any specific issue, and avoids adding a repair command or a new feature for each specific issue in a particular version. It also means a recovery can be provided without releasing the software itself, which is quick and avoids frequent releases.
For example, a shell script can query all buckets missing encryption and update them with the encryption key, using the script as the solution.
Similarly, for the transactionInfo update, instead of having a separate repair command, a script could use this option to apply the fix.
So the idea is to avoid shipping version-specific bug repairs as features, and instead provide a script / steps to recover using generic options.
a shell script can query all buckets missing encryption and update them with the encryption key, using the script as the solution.
Ok so the use case is not really about forming the json manually and putting those objects into the DB, but that we would read existing values using ozone debug ldb, modify them, and write them back?
This is where things get complicated, because we are assuming that no information is lost when going from proto -> json -> proto. All it takes is one @JsonIgnore in some object to mess this up and have this "repair" tool cause more silent bugs. It will be almost impossible to verify that every one of our hundreds of proto objects can survive this translation successfully and therefore verify that the command actually works. For that reason I still do not think this is a good thing to add.
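To make that failure mode concrete, here is a minimal, self-contained sketch of the concern (KeyInfo is a hypothetical stand-in, not an actual OM class), showing how a single @JsonIgnore silently drops a field in a Jackson round trip:

```java
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonRoundTripLoss {
  // Hypothetical value object standing in for an OM wrapper class.
  public static class KeyInfo {
    public String keyName;
    @JsonIgnore // this field silently disappears from the JSON form
    public long updateId;
  }

  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    KeyInfo original = new KeyInfo();
    original.keyName = "key1";
    original.updateId = 42L;

    // Object -> JSON -> object, as the repair flow would do.
    String json = mapper.writeValueAsString(original); // {"keyName":"key1"}
    KeyInfo restored = mapper.readValue(json, KeyInfo.class);

    // No error is raised anywhere, but data was lost in translation:
    System.out.println(restored.updateId); // prints 0, not 42
  }
}
```

Writing `restored` back to the DB would zero out updateId with no warning at all.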
I agree with @errose28 on this. Opening up a way to modify the db directly without any restriction is not a good idea. It's very hard to have unit/functional tests to cover all the scenarios. We could end up in a situation where we won't even know how the data in the db got corrupted.
If it's a must, it is better to have a specific tool or repair command to fix the problems which users frequently run into.
To handle the same:
- proto validation is used to validate the data (a basic check)
- we are planning to add an audit log to record all changes made in the DB itself (diffs of changes where they exist, newly added records, or deleted data) - HDDS-11447

An audit is generally not present in other tools used for repairs, but having this as a unified approach for tracking changes will help in recording them. This can remove the risk of corruption, and allows tracking and recovery on a need basis.
As for support: downloading the RocksDB ldb tool and using it directly instead carries a higher risk, since that is done for immediate recovery of an environment, and it may not even be possible in certain cases.
To handle the same: proto validation is used to validate the data (a basic check); we are planning to add an audit log to record all changes made in the DB itself (diffs of changes where they exist, newly added records, or deleted data) - HDDS-11447
Neither of these addresses my primary concern: that data is lost in the translation of proto objects to or from json. Protos will still be valid if fields are missing or values have drifted unintentionally, and the audit log depends on human readable representations of the data which will not expose any silent bugs in translation to/from human readable format.
As for support: downloading the RocksDB ldb tool and using it directly instead carries a higher risk, since that is done for immediate recovery of an environment, and it may not even be possible in certain cases.
The difference is that RocksDB's ldb is an external tool to interact with RocksDB directly. If someone is using that, it's clear that they are bypassing Ozone and removing any safety guarantees we provide. Once we add something to our CLI, even if it is under ozone repair which comes with its own warnings, there is more trust by the user that it will work correctly. I don't think we can fulfill that requirement.
I think we might be able to safely(ish) add something like this if we depended on protobuf itself to do the json <-> proto translation, but that is not supported until proto 3 (javadoc, package versions).
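For what it's worth, a minimal sketch of what that could look like with protobuf-java-util's JsonFormat once proto3 messages are available (Struct here is just a readily available proto3 message standing in for a table's generated value class):

```java
import com.google.protobuf.Struct;
import com.google.protobuf.util.JsonFormat;

public class ProtoJsonDemo {
  public static void main(String[] args) throws Exception {
    // JSON -> proto, with the translation logic owned by protobuf itself
    // rather than by hand-written wrapper code.
    Struct.Builder builder = Struct.newBuilder();
    JsonFormat.parser().merge("{\"owner\": \"hadoop\"}", builder);

    // proto -> JSON for the reverse direction.
    System.out.println(JsonFormat.printer().print(builder.build()));
  }
}
```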
Protos will still be valid if fields are missing or values have drifted unintentionally, and the audit log depends on human readable representations of the data which will not expose any silent bugs in translation to/from human readable format.
@errose28 Thanks for your input; I have a few points in this regard:
Related to existing repairs: updating the TransactionInfo table as a repair is risky and can cause data loss if the correct value is not provided. Considering this, repair is mostly a risky operation to handle a certain bug.
Currently, any specific repair is delivered in the next version or sub-version, by which point the fix for the bug is already there, so it is sometimes meaningless to provide the repair in the next version. (A generic sanity check and its repair would be an exception, and advantageous for that scenario.) Repair is required for issues in the currently released version.
Now we need a set of tools available to perform repairs (even if risky, since the environment is down). We sometimes propose an external tool, but if that becomes a habit, it is a problem for Ozone that an external tool is the suggested alternative mechanism to perform recovery.
Proto3 provides better capability for proto and JSON conversion, but we operate on service objects, not proto directly. So the approach followed is:
- get JSON from the defined objects, like OM objects or other service objects (not proto), and their schema
- populate the service objects (like OM objects)
- build the proto from the OM objects and save it
So whatever data validation is present in the proto builder (built from the POJO) or in the corresponding proto schema will be applied.
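A rough sketch of those three steps (all names are hypothetical; BucketInfo stands in for an OM service object, and toPersistedBytes() for its proto-building path):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

public class PutViaServiceObject {
  // Hypothetical stand-in for an OM service object such as a bucket info.
  public static class BucketInfo {
    public String volumeName;
    public String bucketName;

    // Stand-in for the real proto-building path: whatever validation the
    // builder or proto schema performs would run here.
    byte[] toPersistedBytes() {
      if (volumeName == null || bucketName == null) {
        throw new IllegalArgumentException("missing required field");
      }
      return (volumeName + "/" + bucketName).getBytes();
    }
  }

  public static void main(String[] args) throws Exception {
    // Steps 1/2: the JSON value supplied on the CLI is mapped onto the
    // service object.
    String cliValue = "{\"volumeName\":\"vol1\",\"bucketName\":\"buck1\"}";
    BucketInfo info = new ObjectMapper().readValue(cliValue, BucketInfo.class);

    // Step 3: the bytes are built from the service object; this is what
    // would be put into the om.db column family.
    byte[] value = info.toPersistedBytes();
    System.out.println(value.length + " bytes ready to put");
  }
}
```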
Direct proto as a format is not supported in this PR, since JSON cannot be converted to proto directly for those tables. There are certain tables of this kind that are not supported by this tool; otherwise, either the service object or the builder pattern the service provides is used.
The audit log can try to provide maximum information based on a comparison of the POJO objects, but there can be some limitations due to parts that are not human-readable. For those cases (on the OM side it is readable, but the DN side may have some such parts), a hex format can be supported to report them.
Please share your opinion on the above use cases and scenarios regarding repair.
Related to existing repairs: updating the TransactionInfo table as a repair is risky and can cause data loss if the correct value is not provided. Considering this, repair is mostly a risky operation to handle a certain bug.
The difference is whether the command works as the user intended. The transaction repair command will correctly update the term and index provided by the user as they intended with no unintended side effects. That is a different type of risk than this command, which may not work as the user intended. This command may silently drop fields already in the DB, either due to unintentional errors in user input or bugs in the translation path.
Repair is required for issues in the currently released version.
This is somewhat of a contradiction. Before a release goes out we would fix all the known bugs, instead of shipping with bugs and repair tools for them together. Our current way of handling this is not bad IMO. We build the repair tools when we discover a need later, and all repair tools can be built and run from the ozone-tools jar independently of the rest of the cluster. Building a generic solution like this does not fully solve the problem because it is not clear how any repair should be done. That would require either very careful docs (which we aren't usually good at) or new wrappers, which would also be built from master the same as ozone-tools already is.
Proto3 provides better capability for proto and JSON conversion, but we operate on service objects, not proto directly. So the approach followed is:
Our wrapper objects are for internal handling, but they are not a necessary abstraction when trying to repair the database directly, and actually get in the way. We want as few layers for things to go wrong as possible. We don't need to use these here and arguably we shouldn't. All complicated translations should be done by protobuf, which is not supported until proto 3.
So whatever data validation is present in the proto builder (built from the POJO) or in the corresponding proto schema will be applied.
As I already said, schema validation and whatever ad-hoc checks we may have are not exhaustive checks that the translation happened correctly.
IMO we should not implement something like this until completing proto 3 migration, which will handle the risky json translation for us.
@errose28 As discussed, we need to avoid the proto-java-json conversion (and vice versa) to avoid any errors that may exist in the Java object itself. Instead, as part of this tool, we will support direct read/put as proto, which can be added after the upgrade to proto3 (see the sketch after the list below).
So:
- we need to support querying the proto object directly as output
- similarly, we need to support updating the proto directly in the DB
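A hypothetical sketch of that direct flow after the proto3 upgrade (Struct again stands in for a table's generated proto3 value class; db and cf would come from opening om.db):

```java
import com.google.protobuf.Struct;
import com.google.protobuf.util.JsonFormat;
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.RocksDB;

public class DirectProtoPut {
  // Read-modify-write that never touches a Java wrapper object:
  // bytes -> proto -> JSON for the operator, then JSON -> proto -> bytes.
  static void putEditedValue(RocksDB db, ColumnFamilyHandle cf,
      byte[] key, String editedJson) throws Exception {
    // Show the operator what is currently stored, straight from the proto.
    byte[] raw = db.get(cf, key);
    if (raw != null) {
      System.out.println(JsonFormat.printer().print(Struct.parseFrom(raw)));
    }

    // Parse the operator's JSON directly into a proto and persist it.
    Struct.Builder builder = Struct.newBuilder();
    JsonFormat.parser().merge(editedJson, builder);
    db.put(cf, key, builder.build().toByteArray());
  }
}
```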
@Tejaskriya Will hold this till proto3 feature is added.