CB-18684: Adding core flow for determining the data sizes of DL data to be backed up.

Open sxxgrc opened this issue 3 years ago • 4 comments

Jira: https://jira.cloudera.com/browse/CB-18684

Please look at this section in my design document to understand the overall idea behind this work.

The primary objective of this is to build upon the changes in #13549 in order to expose a CB method which determines the data sizes of the services which will be backed up on the DL.

This work essentially consists of:

  • A new flow chain which updates Salt and calls a separate new flow (below)
  • A new flow which just calls the Salt orchestrator in order to run the new Salt script for getting the data sizes
    • This flow also interprets the result of the Salt operation in order to obtain the sizes
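As a rough illustration of the two pieces above, here is a minimal sketch. It is not the code in this PR: `DataSizesFlowChainSketch`, `SaltRunner`, `parseSizes`, and `runChain` are hypothetical names, and the `service=bytes` line format is an assumption, since the real output of the Salt script is not described here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the Cloudbreak code base. */
final class DataSizesFlowChainSketch {

    /** Stand-in for the Salt orchestrator calls that the two flows would make. */
    interface SaltRunner {
        void updateSalt();            // step 1 of the chain: refresh Salt on the nodes
        String runDataSizesScript();  // step 2: run the script and return its raw output
    }

    /**
     * Interprets the raw script output. Assumes one "service=bytes" pair per line,
     * e.g. "hbase=1024"; the real format may differ.
     */
    static Map<String, Long> parseSizes(String rawOutput) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (String line : rawOutput.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Unexpected line: " + trimmed);
            }
            sizes.put(parts[0].trim(), Long.parseLong(parts[1].trim()));
        }
        return sizes;
    }

    /** The chain: update Salt first, then run the data-sizes flow and parse its result. */
    static Map<String, Long> runChain(SaltRunner salt) {
        salt.updateSalt();
        return parseSizes(salt.runDataSizesScript());
    }
}
```

The ordering in `runChain` mirrors the description: the Salt update always runs before the script, so the script that reports the sizes is the one shipped with this change.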

The final result of this entire operation is that the Stack status is updated such that the status reason contains the data sizes result, which can then be queried and obtained by an external service (in my next change I will create a method which does exactly this in order to make this as transparent as possible). This was done for 2 main reasons:

  1. To ensure that the method as a whole is asynchronous (i.e. the method returns nothing but starts the flow chain)
  2. To keep the modification as simple as possible: I did not want to introduce new DB fields or tables
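To make the two reasons concrete, here is a minimal sketch of the pattern, under stated assumptions. `StatusReasonResultSketch`, `StackStatus`, and `triggerDetermineDataSizes` are hypothetical names, not the classes touched by this PR, and the executor stands in for whatever actually schedules the flow chain.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch; the real Stack status entity and flow trigger look different. */
final class StatusReasonResultSketch {

    /** Stand-in for the persisted stack status; only the status reason matters here. */
    static final class StackStatus {
        private final AtomicReference<String> statusReason = new AtomicReference<>("");

        String getStatusReason() {
            return statusReason.get();
        }

        void setStatusReason(String reason) {
            statusReason.set(reason);
        }
    }

    /**
     * Returns nothing: it only schedules the work. When the (assumed) flow chain
     * finishes, its result is written into the existing status reason field,
     * so no new DB field or table is needed.
     */
    static void triggerDetermineDataSizes(StackStatus status,
                                          Supplier<String> flowChain,
                                          Executor executor) {
        executor.execute(() -> status.setStatusReason(flowChain.get()));
    }
}
```

A caller would pass a real thread pool as the executor and poll `getStatusReason()` later; passing `Runnable::run` makes the sketch run synchronously, which is handy for trying it out.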

I have tested this by creating a local DL and running the flow via its exposed endpoint with Postman 2. I then used pgAdmin 4 to ensure the stack status DB was correctly updated with the final result.

sxxgrc, Oct 03 '22 23:10

Marking this as DO NOT MERGE as it is dependent on the changes in #13549 and I want that to be merged in first.

sxxgrc, Oct 03 '22 23:10

This is now ready for reviews and to be merged.

sxxgrc, Oct 06 '22 15:10

Marking this as DO NOT MERGE again as this is now dependent on a small Salt fix for the backing logic in #13582.

sxxgrc, Oct 08 '22 06:10

As I understand it, you are using the stack status to asynchronously pull the data sizes. When is the stack status reset? How can we ensure that the stack status is left without being reset?

The stack status is reset within the change for #13551. I decided to separate the logic to keep it clean and understandable. The change in this PR ends when the stack status is updated with the result, and #13551 picks up on this and returns the result/clears the status.

I have verified that the status is not cleaned up otherwise, but I still added logic in the other change to account for cases where the status is modified.
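For illustration only, the read-and-clear step with a guard against a modified status could look roughly like the sketch below. This is not the code in #13551: `ConsumeStatusReasonSketch`, `consume`, and the `DATA_SIZES:` marker are all assumptions made for the example.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of returning the result and clearing the status reason. */
final class ConsumeStatusReasonSketch {

    /** Assumed marker that distinguishes a data-sizes result from any other status reason. */
    static final String MARKER = "DATA_SIZES:";

    static Optional<String> consume(AtomicReference<String> statusReason) {
        String current = statusReason.get();
        // The status was modified by something else (or never set): report no result.
        if (current == null || !current.startsWith(MARKER)) {
            return Optional.empty();
        }
        // Clear only if nobody replaced the reason since we read it.
        if (!statusReason.compareAndSet(current, "")) {
            return Optional.empty();
        }
        return Optional.of(current.substring(MARKER.length()));
    }
}
```

The point of the guard is that a caller gets either the complete result or nothing, and an unrelated status reason is never cleared by mistake.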

sxxgrc, Oct 13 '22 16:10

#13582 has been merged and this has been tested with the updated changes so this is ready to be merged whenever. Thanks.

sxxgrc, Oct 25 '22 16:10

The final result of this entire operation is that the Stack status is updated such that the status reason contains the data sizes result, which can then be queried and obtained by an external service (in my next change I will create a method which does exactly this in order to make this as transparent as possible).

I don't understand why this can't be the first step whenever we initiate a backup operation. That way there is no need to store this data, as it's used immediately in the new flow steps and in the flow context. Let's not use a statusReason field as a control mechanism for other services. If it's really required to persist such data, CB should call the DR service and save it there if it serves any purpose later on.

lnardai, Oct 28 '22 11:10

A question arises in my mind: Why are hbase and solr sizes calculated from CB if the separated backup service can read from these services and can back up data? I assume it can also read the data sizes. For example, the solr size calculation happens through API calls, so I presume the DR backup service should be able to call these APIs.

sodre90, Oct 28 '22 11:10

A question arises in my mind: Why are hbase and solr sizes calculated from CB if the separated backup service can read from these services and can back up data? I assume it can also read the data sizes. For example, the solr size calculation happens through API calls, so I presume the DR backup service should be able to call these APIs.

Hello Peter,

HBase does not expose APIs for this. We get access to HBase to perform export/import through CM APIs. The only way to gather the size information is by running commands on the VM nodes.

From the control plane, we can access the APIs exposed through Knox. We initially considered reaching the REST endpoints of Solr, but the endpoints we want to access cannot be accessed from the CP.

kkalvagadda1, Oct 31 '22 09:10