
Idea: Shard by geolocation

Open · JelteF opened this issue on Aug 21, 2019 · 1 comment

Goal

Store all records related to users in a given geolocation only on workers in that geolocation.

Options

Create a distributed table for each geolocation

Need some way to wrap/encapsulate the tables (see the sketch after this list)

Distribute a table by multiple columns

Elegant, but requires a lot of changes:

  • the planner needs to detect a join on two columns
  • range (geo_location) + hash (tenant) distribution would be nice
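
For the first option, here is a minimal sketch of what one distributed table per geolocation could look like, using a plain view as the wrapper. The table and view names are hypothetical, and pinning each table's shards to workers in its own region would still need to be arranged separately:

-- one hash-distributed table per region (hypothetical names)
CREATE TABLE events_eu (tenant_id bigint, event_payload jsonb);
CREATE TABLE events_us (tenant_id bigint, event_payload jsonb);
SELECT create_distributed_table('events_eu', 'tenant_id');
SELECT create_distributed_table('events_us', 'tenant_id');

-- encapsulate the per-region tables behind a single view for reads
CREATE VIEW all_events AS
  SELECT 'EU' AS location, tenant_id, event_payload FROM events_eu
  UNION ALL
  SELECT 'US' AS location, tenant_id, event_payload FROM events_us;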

JelteF · Aug 21 '19 11:08

Need some way to wrap/encapsulate the tables

One approach would be to have hierarchical distributed tables. The top-level table is range-distributed by geolocation and its shards live on the coordinator itself. Each shard is another distributed table in a separate co-location group.

The poor man's version of this would be:

CREATE TABLE global_events (location text, tenant_id bigint, event_payload jsonb, event_id bigserial, event_time timestamptz);
CREATE TABLE global_events_1 (location text, tenant_id bigint, event_payload jsonb, event_id bigserial, event_time timestamptz);
CREATE TABLE global_events_2 (location text, tenant_id bigint, event_payload jsonb, event_id bigserial, event_time timestamptz);

-- each "shard" of the top-level table is itself a hash-distributed
-- table in its own co-location group
SELECT create_distributed_table('global_events_1', 'tenant_id', colocate_with := 'none');
SELECT create_distributed_table('global_events_2', 'tenant_id', colocate_with := 'none');

-- the top-level table is range-distributed by location
SELECT create_distributed_table('global_events', 'location', 'range');

-- make sure the coordinator is added to itself
SELECT * FROM master_add_node('localhost', 5432);
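-- (in newer Citus releases this function is named citus_add_node)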

-- set up the metadata for the shards by hand
-- pg_dist_shard columns: logicalrelid, shardid, shardstorage, shardminvalue, shardmaxvalue
INSERT INTO pg_dist_shard VALUES ('global_events', 1, 'r', 'EU', 'EU');
INSERT INTO pg_dist_shard VALUES ('global_events', 2, 'r', 'US', 'US');
-- pg_dist_shard_placement columns: shardid, shardstate, shardlength, nodename, nodeport
INSERT INTO pg_dist_shard_placement VALUES (1, 1, 0, 'localhost', 5432);
INSERT INTO pg_dist_shard_placement VALUES (2, 1, 0, 'localhost', 5432);

-- do a globally distributed transaction
BEGIN;
INSERT INTO global_events VALUES ('EU', 1, '{"hello":"world"}');
INSERT INTO global_events VALUES ('US', 1, '{"hello":"world"}');
SELECT * FROM global_events;
END;
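
Because the top-level table is range-distributed by location, a query that filters on the location column should prune to a single shard and therefore only involve that region's co-location group (assuming shard pruning works through the nested table as expected):

-- should prune to shard 1, i.e. only touch the EU tables
SELECT * FROM global_events WHERE location = 'EU' AND tenant_id = 1;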

Not all operations are supported:

TRUNCATE global_events;
ERROR:  canceling the transaction since it was involved in a distributed deadlock
CONTEXT:  while executing command on localhost:5432

ALTER TABLE global_events ADD COLUMN newcol int;
ERROR:  relation "global_events" does not exist
CONTEXT:  while executing command on localhost:5432

DELETE FROM global_events;
WARNING:  cannot use 2PC in transactions involving multiple servers
WARNING:  connection error: localhost:5432
DETAIL:  another command is already in progress
WARNING:  failed to roll back prepared transaction 'citus_0_39580_77_1'
HINT:  Run "ROLLBACK PREPARED 'citus_0_39580_77_1'" on localhost:5432
ERROR:  cannot use 2PC in transactions involving multiple servers
CONTEXT:  while executing command on localhost:5432

Once we implement #2830, we should be able to handle most operations (except COPY) in the coordinator backend, which would resolve a lot of these issues.

We should also implement #2512 to allow the coordinator to be in pg_dist_node without placing shards there.

marcocitus · Aug 21 '19 12:08