greptimedb icon indicating copy to clipboard operation
greptimedb copied to clipboard

Table engine supports multi-region

Open v0y4g3r opened this issue 2 years ago • 7 comments

What type of enhancement is this?

Tech debt reduction

What does the enhancement do?

Currently in GreptimeDB, only one region can be created for a table on one datanode, which will cause data corruption when the number of table partitons exceeds the number of datanodes.

In fact, table engine supports creating multiple regions for one table, but we have to go through all those components on datanodes whether they can accommodate multiple regions.

Implementation challenges

To support multiple regions on datanode, both standalone and distributed mode should be considered.

In distributed mode, requests are parsed on frontend, and frontend forward queries to datanode. Thus, datanode's incoming requests must carry region info, for example, the region to insert, regions to query from etc. It should be relatively easy to add this parameters.

While in standalone mode, things get a bit tricky. If datanode has only one region, then query engine is not aware of partition rules. But if datanode has multiple regions for one table, then query engine should associate requests with corresponding regions according to partition rules. There are two problems here:

  • In standalone mode, catalog manager does not support storing partition rules, we must add an options field to SystemCatalogTable https://github.com/GreptimeTeam/greptimedb/blob/8402ef94af0043351cceecfa7620e0d8fe320be6/src/catalog/src/system.rs#L49
  • Find regions for queries according to partition rules. Currently it's implemented in DistTable. But in standalone mode, requests are directly forwarded to datanode. Thus we need to extract the partition logic to some separate struct to reused it in both distributed mode and standalone mode. https://github.com/GreptimeTeam/greptimedb/blob/8402ef94af0043351cceecfa7620e0d8fe320be6/src/frontend/src/table.rs#L85-L95

v0y4g3r avatar Nov 24 '22 03:11 v0y4g3r

MitoTable should be modified to accommodate multiple regions, and it's create/open/insert/select methods must explicitly provide regions.

Currently all partition rules are parsed in frontend and datanode does not validate if data inserted satisfies partition rule, but we may add this validation once repartition is supported.

v0y4g3r avatar Nov 28 '22 07:11 v0y4g3r

As for table scan, current Table::scan method does not provide a region argument:

https://github.com/GreptimeTeam/greptimedb/blob/c144a1b20ea775af1b174beec4e3dcbfcbb277f6/src/table/src/table.rs#L55-L65

Maybe we should add an regions: Option<Vec<RegionNumber>> argument.

v0y4g3r avatar Nov 28 '22 09:11 v0y4g3r

Maybe we should add an regions: Option<Vec<RegionNumber>> argument.

Vec<RegionNumber> should be adequate as None and an empty vector should represent the same thing.

evenyag avatar Nov 29 '22 02:11 evenyag

Maybe we should add an regions: Option<Vec<RegionNumber>> argument.

Vec<RegionNumber> should be adequate as None and an empty vector should represent the same thing.

Also, if there're regions with different schema in a table, then what will be the schema for record batches yielded? Can we compare the versions of schemas of all regions and find the schema with the min version?

v0y4g3r avatar Nov 29 '22 02:11 v0y4g3r

Can we compare the versions of schemas of all regions and find the schema with the min version

This might be the most applicable way before #275 is done. BTW, this reminds me that we need to move the version field somewhere else, as mentioned in #349

evenyag avatar Nov 29 '22 03:11 evenyag

But you may need to handle the projection field carefully and solve the difference of schema between table and region. What about returning an error if regions have different schemas in the first PR and leaving some work to the next PR?

evenyag avatar Nov 29 '22 03:11 evenyag

But you may need to handle the projection field carefully and solve the difference of schema between table and region. What about returning an error if regions have different schemas in the first PR and leaving some work to the next PR?

Projection is an issue. Looks better to just throw an error for now.

v0y4g3r avatar Nov 29 '22 03:11 v0y4g3r

Looks like region failover(#1126) also requires multi region support.

MichaelScofield avatar May 15 '23 06:05 MichaelScofield

Close this issue as version 0.4 already supports multi-region.

fengjiachun avatar Oct 17 '23 08:10 fengjiachun