scylla-operator
scylla-operator copied to clipboard
Add integrity information to must-gather
Description of your changes: This PR add integrity information to must-gather dump. This will help to identify whether files have been removed or the collection failed.
Which issue is resolved by this Pull Request: Resolves #1847
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: tnozicka
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [tnozicka]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
I only had a brief look - wouldn't a merkle tree be a standard approach for this use case?
It's valid but more for cryptography.
The approach here allows you to easily identify the places with changes with respect to the particular resource and couples well with linux CLI tools. Say, to verify a folder manually, you can run find scylla-operator-must-gather/namespaces/scylla-operator/serviceaccounts -mindepth 1 -exec cat {} \; | sha512sum -
and you have the explicit count so you at least know how many files are missing.
@tnozicka: The following test failed, say /retest
to rerun all failed tests or /retest-required
to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/e2e-gke-release-script-latest | 68232b504f289b214df48a950faad6eab6c6d620 | link | true | /test e2e-gke-release-script-latest |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
I only had a brief look - wouldn't a merkle tree be a standard approach for this use case?
It's valid but more for cryptography.
The approach here allows you to easily identify the places with changes with respect to the particular resource and couples well with linux CLI tools. Say, to verify a folder manually, you can run find scylla-operator-must-gather/namespaces/scylla-operator/serviceaccounts -mindepth 1 -exec cat {} ; | sha512sum -
would this approach also allow you to check if a given file was originally a part of the archive?
and you have the explicit count so you at least know how many files are missing.
is that actually a valuable information if you don't know which ones are missing?
Since you already put the work into this I don't feel strong about using the merkle tree (altough there probable are many existing implementations), I'm struggling to understand what is the purpose of this exactly if we don't have the capability of signing it (correct me if I'm wrong here)? The PR description says This will help to identify whether files have been removed or the collection failed
, so it's working on an assumption that someone purposefully tampered with the archive, so they are already "malicious" (or non-compliant?) - what's to stop them from tampering with the integrity information as well? Would you be able to tell the modified archive apart from an archive in which some manifests just weren't collected?
would this approach also allow you to check if a given file was originally a part of the archive?
Transiently. But anything can be messed up with, it's just the amount of work that needs to be put in. Keep in mind this is meant primarily for detecting removed files, not to remove some and add fake files. At a point where the folder doesn't match we can just ask for a new collection.
We could extent this to go at the file level.
is that actually a valuable information if you don't know which ones are missing?
The most pressing concern is temper detection, after that point I'd usually just not proceed further.
Since you already put the work into this I don't feel strong about using the merkle tree (altough there probable are many existing implementations)
Using a merkle tree it is not easy or possible to use standard linux tooling for verification and you have to add extra commands. It's a valid option though.
what's to stop them from tampering with the integrity information as well?
nothing, with any approach we take. they are tempering with the archive with the intent to limit the data to what they think is not related (usually wrongly), not to be intentfully malicious.
PR needs rebase.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
is that actually a valuable information if you don't know which ones are missing?
The most pressing concern is temper detection, after that point I'd usually just not proceed further.
Are you saying that we wouldn't investigate a customer issue if they removed any file, even if we weren't sure it was necessarily related to Scylla?
what's to stop them from tampering with the integrity information as well?
nothing, with any approach we take. they are tempering with the archive with the intent to limit the data to what they think is not related (usually wrongly), not to be intentfully malicious.
I mean, I get the point, but at this point you'd have to assume they deleted a related file and assume a somewhat hostile approach. I'm not sure how that'd turn out in practice.
Before we go into reviewing the implementation, @zimnx what are your thoughts on this?