construct named collections definition from system table

Open goncharovnikita opened this issue 9 months ago • 2 comments

Summary by Sourcery

Implement full backup of named collections by querying their JSON definitions, constructing CREATE NAMED COLLECTION statements, and uploading them directly via the storage loader.

New Features:

  • Return named collections as a mapping from name to full CREATE NAMED COLLECTION DDL instead of just names.
  • Add direct upload of named collections DDL data via the storage loader instead of using a local file.

Enhancements:

  • Introduce a static helper to build named collection DDL from JSON definitions.
  • Modify control logic to fetch both name and definition and map them to DDL queries.
  • Update backup layout and orchestration to pass collection data through to the upload method.
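The DDL helper mentioned above might look roughly like the sketch below. It is an illustration only, not the code from this PR: the function names, the `overridable` flag map, and the quoting rules (backslash escapes inside backticks and single quotes) are assumptions, and the real shape of the definition returned by `system.named_collections` may differ.

```python
from typing import Mapping, Optional


def quote_identifier(name: str) -> str:
    """Backtick-quote an identifier, escaping backslashes and backticks."""
    return "`" + name.replace("\\", "\\\\").replace("`", "\\`") + "`"


def quote_string(value: str) -> str:
    """Single-quote a string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_named_collection_ddl(
    name: str,
    collection: Mapping[str, str],
    overridable: Optional[Mapping[str, bool]] = None,
) -> str:
    """Build a CREATE NAMED COLLECTION statement from a key -> value mapping.

    `overridable` is a hypothetical per-key flag map. Keys missing from it
    get no clause at all, so the server default applies to them.
    """
    if not collection:
        raise ValueError("a named collection needs at least one key")
    flags = overridable or {}
    parts = []
    for key, value in collection.items():
        part = f"{quote_identifier(key)} = {quote_string(value)}"
        if key in flags:
            part += " OVERRIDABLE" if flags[key] else " NOT OVERRIDABLE"
        parts.append(part)
    return f"CREATE NAMED COLLECTION {quote_identifier(name)} AS " + ", ".join(parts)


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(
        build_named_collection_ddl(
            "s3_creds",
            {"url": "https://example.com/bucket/", "secret": "it's"},
            {"secret": False},
        )
    )
```

Quoting both identifiers and values keeps the generated statement valid when a key or value contains a quote character, which matters because the statement is replayed verbatim on restore.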

Tests:

  • Adjust integration tests to expect OVERRIDABLE clauses in the DDL and bump the required ClickHouse version to 24.3.

goncharovnikita, Mar 17 '25 13:03

@sourcery-ai review

Alex-Burmak, May 23 '25 04:05

Reviewer's Guide

This PR enhances named collection handling in five ways: it retrieves full metadata from the system table, generates DDL statements dynamically, uploads the DDL as raw data instead of a local file, passes collection data through the backup logic, and updates the integration tests for the new OVERRIDABLE flags and the higher ClickHouse version requirement.

Sequence Diagram for Updated Named Collection Backup Process

sequenceDiagram
    participant NCL as NamedCollectionsLogic
    participant CCTL as ClickhouseCTL
    participant CH as ClickHouse DB
    participant BL as BackupLayout
    participant SL as StorageLoader

    NCL->>CCTL: get_named_collections_query()
    activate CCTL
    CCTL->>CH: Query("SELECT name, collection FROM system.named_collections")
    activate CH
    CH-->>CCTL: [{name: ..., collection: ...}, ...]
    deactivate CH
    loop For each named collection
        CCTL->>CCTL: _create_named_collection_ddl_query(name, collection)
        CCTL-->>CCTL: ddl_statement
    end
    CCTL-->>NCL: Dict[name_str, ddl_statement_str]
    deactivate CCTL

    activate NCL
    loop For each name, ddl_statement in Dict
        NCL->>BL: upload_named_collections_create_statement(backup_name, name, ddl_statement)
        activate BL
        BL->>SL: upload_data(data=ddl_statement, remote_path, encryption=True)
        activate SL
        SL-->>BL: Success
        deactivate SL
        BL-->>NCL: Success
        deactivate BL
    end
    deactivate NCL
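The upload half of the sequence above can be sketched as follows. The method names `upload_named_collections_create_statement` and `upload_data` come from the diagram; everything else (the in-memory loader, the constructor arguments, and especially the remote path layout) is a hypothetical stand-in and not ch-backup's actual implementation.

```python
from typing import Dict


class InMemoryStorageLoader:
    """Hypothetical stand-in for the storage loader; keeps uploads in a dict."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def upload_data(self, data: str, remote_path: str, encryption: bool = False) -> None:
        # A real loader would encrypt and ship the payload to remote storage.
        # This fake only records the encoded bytes under the given path.
        self.objects[remote_path] = data.encode("utf-8")


class BackupLayout:
    """Minimal layout object that maps a collection name to a remote path."""

    def __init__(self, loader: InMemoryStorageLoader, root: str = "backups") -> None:
        self.loader = loader
        self.root = root

    def upload_named_collections_create_statement(
        self, backup_name: str, nc_name: str, nc_data: str
    ) -> None:
        # Assumed path scheme, chosen only for this example.
        remote_path = f"{self.root}/{backup_name}/named_collections/{nc_name}.sql"
        self.loader.upload_data(data=nc_data, remote_path=remote_path, encryption=True)


def backup_named_collections(
    layout: BackupLayout, backup_name: str, statements: Dict[str, str]
) -> None:
    """Upload one DDL statement per named collection, with no local file involved."""
    for name, ddl in statements.items():
        layout.upload_named_collections_create_statement(backup_name, name, ddl)


if __name__ == "__main__":
    loader = InMemoryStorageLoader()
    backup_named_collections(
        BackupLayout(loader),
        "backup_1",
        {"s3_creds": "CREATE NAMED COLLECTION `s3_creds` AS `url` = 'https://example.com/'"},
    )
    print(sorted(loader.objects))
```

Passing the DDL string straight to `upload_data` removes the dependency on a metadata file existing on the local disk, which is the point of the switch away from `upload_file` described in this PR.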

Class Diagram for Updated Named Collection Handling

classDiagram
    class ClickhouseCTL {
        +get_named_collections_query() : Dict~str, str~
        + _create_named_collection_ddl_query(name: str, collection: Dict) : str  ~$static~
    }
    class BackupLayout {
        +upload_named_collections_create_statement(backup_name: str, nc_name: str, nc_data: str) : void
    }
    class NamedCollectionsLogic {
        +backup(context: BackupContext) : void
    }

    NamedCollectionsLogic --> ClickhouseCTL : uses
    NamedCollectionsLogic --> BackupLayout : uses

File-Level Changes

Change: Implement full metadata retrieval and DDL generation for named collections
Details:
  • Select both name and collection fields in the SQL query
  • Change get_named_collections_query to return a name→DDL mapping
  • Add a helper to build CREATE NAMED COLLECTION statements from metadata
Files: ch_backup/clickhouse/control.py

Change: Switch upload of named collections to use raw data
Details:
  • Add nc_data parameter to upload_named_collections_create_statement
  • Remove local file path construction
  • Replace upload_file calls with upload_data using nc_data
Files: ch_backup/backup/layout.py

Change: Refactor backup logic to handle collection metadata dicts
Details:
  • Update logging to iterate over dict keys
  • Loop over (name, data) pairs and pass metadata to upload
Files: ch_backup/logic/named_collections.py

Change: Adjust integration test to new collection flags and version
Details:
  • Add NOT OVERRIDABLE/OVERRIDABLE annotations to create statements
  • Bump @require_version from 23.2 to 24.3
Files: tests/integration/features/named_collections.feature

sourcery-ai[bot], May 23 '25 05:05