
Adds logging of column family size and database size on startup and s…

Open elijahhampton opened this issue 1 year ago • 3 comments

Motivation

What are the most important goals of the ticket or PR?

This PR allows Zebra users to monitor the storage and memory usage of a Zebra node by logging the database size and the size of each column family. It addresses the following issue: https://github.com/ZcashFoundation/zebra/issues/7416

PR Author Checklist

Check before marking the PR as ready for review:

  • [X] Will the PR name make sense to users?
  • [ ] Does the PR have a priority label?
  • [ ] Have you added or updated tests?
  • [X] Is the documentation up to date?
For significant changes:
  • [ ] Is there a summary in the CHANGELOG?
  • [ ] Can these changes be split into multiple PRs?

If a checkbox isn't relevant to the PR, mark it as done.

Specifications

Not applicable.

Complex Code or Requirements

No.

Solution

A function was added to the DiskDb struct that reads metrics from each column family handle, prints each metric, and calculates totals across all column families for three values: live database size, total SST files size, and size of tables in memory.
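The aggregation described above can be sketched as follows. This is a minimal illustration, not the PR's code: `PropertySource`, `SizeMetrics`, and `collect_db_metrics` are hypothetical names, and a `HashMap` stands in for a real column family handle. The three property strings are standard RocksDB property names; that the PR reads exactly these properties is an assumption. With the `rocksdb` crate, a lookup of this kind would typically go through `property_int_value_cf`.

```rust
use std::collections::HashMap;

// Standard RocksDB property names (assumed to match the three sizes in the PR).
pub const LIVE_DATA_SIZE: &str = "rocksdb.estimate-live-data-size";
pub const TOTAL_SST_FILES_SIZE: &str = "rocksdb.total-sst-files-size";
pub const MEM_TABLES_SIZE: &str = "rocksdb.size-all-mem-tables";

/// Hypothetical stand-in for a column family handle: it only answers
/// integer property lookups, which is all this sketch needs.
pub trait PropertySource {
    fn int_property(&self, name: &str) -> Option<u64>;
}

/// In-memory implementation, used here in place of a real database.
impl<'a> PropertySource for HashMap<&'a str, u64> {
    fn int_property(&self, name: &str) -> Option<u64> {
        self.get(name).copied()
    }
}

#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct SizeMetrics {
    pub live_data_bytes: u64,
    pub sst_files_bytes: u64,
    pub mem_tables_bytes: u64,
}

/// Reads the three size properties, treating a missing property as zero.
fn read_metrics(cf: &dyn PropertySource) -> SizeMetrics {
    SizeMetrics {
        live_data_bytes: cf.int_property(LIVE_DATA_SIZE).unwrap_or(0),
        sst_files_bytes: cf.int_property(TOTAL_SST_FILES_SIZE).unwrap_or(0),
        mem_tables_bytes: cf.int_property(MEM_TABLES_SIZE).unwrap_or(0),
    }
}

/// Returns one log line per column family plus the totals across all of them.
pub fn collect_db_metrics(
    column_families: &[(&str, &dyn PropertySource)],
) -> (Vec<String>, SizeMetrics) {
    let mut lines = Vec::new();
    let mut total = SizeMetrics::default();
    for &(name, cf) in column_families {
        let m = read_metrics(cf);
        lines.push(format!(
            "{}: live={} B, sst={} B, memtables={} B",
            name, m.live_data_bytes, m.sst_files_bytes, m.mem_tables_bytes
        ));
        total.live_data_bytes += m.live_data_bytes;
        total.sst_files_bytes += m.sst_files_bytes;
        total.mem_tables_bytes += m.mem_tables_bytes;
    }
    (lines, total)
}
```

Returning the lines instead of printing them keeps the sketch easy to test; a real implementation would more likely pass them to the node's logging macros.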

A call to log_db_metrics was added right after the state service initialization. This ensures that as soon as the database is initialized and ready, its metrics are logged.

To handle various shutdown scenarios (e.g., graceful shutdown, errors, SIGINT), the logging of metrics at shutdown was encapsulated within the Drop trait for StateService. The Drop trait's drop method is automatically called when an instance goes out of scope, making it a reliable place to perform cleanup tasks and final logging actions.
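As a rough illustration of the Drop-based approach, the sketch below logs once at construction and once when the value is dropped. `StateServiceSketch` and its shared `log` field are hypothetical stand-ins for the real StateService, used only so the drop can be observed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the state service. The shared log replaces
/// real logging so the effect of `drop` can be inspected afterwards.
pub struct StateServiceSketch {
    log: Rc<RefCell<Vec<String>>>,
}

impl StateServiceSketch {
    pub fn new(log: Rc<RefCell<Vec<String>>>) -> Self {
        // Startup logging happens right after initialization.
        log.borrow_mut().push("startup: db metrics".to_string());
        StateServiceSketch { log }
    }
}

impl Drop for StateServiceSketch {
    fn drop(&mut self) {
        // Runs when the value goes out of scope, including during unwinding.
        self.log.borrow_mut().push("shutdown: db metrics".to_string());
    }
}
```

One caveat on this design: destructors do not run when the process exits through `std::process::exit`, when it aborts, or when a signal terminates it without the application handling that signal. The shutdown log therefore depends on those paths ending in a normal drop of the service.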

Finally, the logic to build the column families vector was encapsulated into a function construct_column_families.

Testing and Review

Testing can be manually completed. See testing instructions below. This PR is not blocking any other work.

Testing instructions: Manually compare the total with the size on disk using du, and the size in memory using top.

RocksDB uses extra files for old data and deleted data, so the disk sizes it reports should be smaller than the size shown by du. The live disk size should also be smaller than the total disk size.

Zebra uses memory outside RocksDB, so the RocksDB memory usage should be smaller than the process memory shown by top.
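The disk-size comparison above could also be scripted. The sketch below sums file lengths under a directory, roughly what du does, and checks the ordering the instructions expect. `dir_size_bytes` and `sizes_look_consistent` are hypothetical helpers, not part of this PR. Note that du reports allocated blocks by default, so its figure can differ slightly from a sum of file lengths, and the live size is an estimate.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively sums the lengths of all regular files under `dir`.
pub fn dir_size_bytes(dir: &Path) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            total += dir_size_bytes(&entry.path())?;
        } else if file_type.is_file() {
            total += entry.metadata()?.len();
        }
        // Symlinks and other entry types are skipped in this sketch.
    }
    Ok(total)
}

/// Expected ordering from the testing instructions:
/// live size <= total SST size <= size on disk.
pub fn sizes_look_consistent(live: u64, total_sst: u64, on_disk: u64) -> bool {
    live <= total_sst && total_sst <= on_disk
}
```

A reviewer would call `dir_size_bytes` on the node's state directory and pass the logged totals to `sizes_look_consistent`.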

Reviewer Checklist

Check before approving the PR:

  • [ ] Does the PR scope match the ticket?
  • [ ] Are there enough tests to make sure it works? Do the tests cover the PR motivation?
  • [ ] Are all the PR blockers dealt with? PR blockers can be dealt with in new tickets or PRs.

And check the PR Author checklist is complete.

Follow Up Work

elijahhampton avatar Mar 02 '24 19:03 elijahhampton

Thank you for the PR! We will try to review shortly (sometime this week)

mpguerra avatar Mar 04 '24 09:03 mpguerra

It seems we need to add print_db_metrics to the ZebraDb structure. I think it can be something like:

    pub fn print_db_metrics(&self) {
        self.db.print_db_metrics();
    }

I have not tried it, though. Let me know if you want me to try to fix that part.

https://github.com/ZcashFoundation/zebra/actions/runs/8246495198/job/22552667617?pr=8336#step:6:522

oxarbitrage avatar Mar 12 '24 14:03 oxarbitrage

Okay, I just saw these comments. I will take care of this asap. Of course feel free to edit/fix anything!

elijahhampton avatar Mar 12 '24 22:03 elijahhampton

I created a follow-up ticket to convert bytes to human-readable form: https://github.com/ZcashFoundation/zebra/issues/8380
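For illustration only, one possible shape for such a conversion is sketched below. `human_readable_bytes` is a hypothetical helper, and the choice of binary units (KiB, MiB, ...) with one decimal place is an assumption rather than what the follow-up ticket specifies.

```rust
/// Formats a byte count using binary units (1 KiB = 1024 B).
pub fn human_readable_bytes(bytes: u64) -> String {
    const UNITS: [&str; 7] = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"];
    if bytes < 1024 {
        // Small values are shown exactly, without a fractional part.
        return format!("{} B", bytes);
    }
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide down until the value fits the current unit or units run out.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    format!("{:.1} {}", value, UNITS[unit])
}
```

For example, `human_readable_bytes(1536)` yields `1.5 KiB`. Because the output is rounded to one decimal place, a value just below a unit boundary can display as `1024.0` of the smaller unit.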

oxarbitrage avatar Mar 26 '24 11:03 oxarbitrage