Implement table roles

Open waynexia opened this issue 2 years ago • 4 comments

Description

In distributed mode a table might be one of three roles: Writer, Reader and Querier based on the cluster's status. We need to implement these roles. The server currently only contains logic for a table that is a Writer, which can write data, append WAL, modify table options or perform compaction. The other two roles are:

  • Reader which only replicates WAL from the corresponding Writer, and provides the same query ability as Writer
  • Querier which won't communicate with WAL. It is designed to only provide query ability on top of data in OSS. But in the future we can let Reader provide the un-flushed fresh data.
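
To make the split between the three roles concrete, here is a minimal sketch of how they differ. The `TableRole` enum and its method names are hypothetical (the issue only names the roles and does not define a type); the capabilities are taken from the description above.

```rust
/// Hypothetical enum: the issue names the three roles but defines no type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TableRole {
    Writer,
    Reader,
    Querier,
}

impl TableRole {
    /// Only a Writer may write data, append WAL, modify options, or compact.
    pub fn can_mutate(self) -> bool {
        matches!(self, TableRole::Writer)
    }

    /// A Writer appends to WAL and a Reader replicates from it;
    /// a Querier never communicates with WAL.
    pub fn touches_wal(self) -> bool {
        !matches!(self, TableRole::Querier)
    }

    /// A Querier serves only data already in OSS; the other two
    /// roles can also serve data that has not been flushed yet.
    pub fn sees_unflushed_data(self) -> bool {
        self.touches_wal()
    }
}
```

All three roles can answer queries, so there is no separate predicate for that.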

Proposal

This has several parts:

  • How Role is represented: Introduce a trait RoleTable and three impls: WriterTable, ReaderTable and QuerierTable. The role-specific logic behind the shared interface is encapsulated inside each impl. This abstraction should only be used to group operation logic; the underlying TableData is the same for every role (for a given table, of course).
  • Changes to the existing mental model: The current hierarchy is Instance -> Spaces -> Tables. I propose inserting RoleTable between Space and Table. A RoleTable (and its three implementations) holds one TableData and provides top-level interfaces for operating on the data, such as write or read. Currently we operate on that data directly in Instance.
  • How to manage the state of a table's role: RoleTable provides two mechanisms for syncing roles.
    • An atomic flag that represents the current status. In most cases we use RoleTable through a type like Arc<dyn RoleTable>. That Arc handle cannot be swapped quickly, so an atomic flag shows what the actual status is. This mechanism is designed for rejecting calls made to an incorrect role. E.g., a table changes from Writer to Reader but the WriterTable handle is still held by some tasks; we can set the flag and fail those tasks.
    • A notifier that sends a message when the last reference is dropped. Changing roles is a long procedure, and we might want to do something after the previous role's handles are fully dropped. This notifier is for that.
  • How WAL replicates (#179): WAL is an important and complex part of this "role system". The Writer writes to it and the Reader keeps reading from it. The proposed workflow looks like the following; I plan to use one replicator per Instance to replicate all the tables:
          register IDs
         need replicate
     ┌─────────────────────┐
     │                     │
     │              ┌──────▼───────┐
┌────┴─────┐        │  background  │
│Role Table│        │WAL Replicator│
└────▲─────┘        └──────┬───────┘
     │                     │
     └─────────────────────┘
          replicate log
            to table
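
The role abstraction and its two syncing mechanisms could be sketched roughly as follows. This is only an illustration under stated assumptions, not the eventual implementation: the method set on `RoleTable`, the `revoke` name, the `Role` enum, the string-based rows and errors, and the use of a std `mpsc` channel as the notifier are all hypothetical. Only `WriterTable` is shown; `ReaderTable` and `QuerierTable` would wrap the same `TableData` with different logic.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Writer,
    Reader,
    Querier,
}

/// Stand-in for the real `TableData`; it is shared by every role of a table.
#[derive(Default)]
pub struct TableData {
    rows: Mutex<Vec<String>>,
}

/// Hypothetical shape of the trait: top-level operations on one table.
pub trait RoleTable: Send + Sync {
    fn role(&self) -> Role;
    fn write(&self, row: &str) -> Result<(), String>;
    fn read(&self) -> Result<Vec<String>, String>;
    /// Set the atomic flag so that stale handles start failing.
    fn revoke(&self);
}

pub struct WriterTable {
    data: Arc<TableData>,
    /// Mechanism 1: atomic flag showing whether this handle's role still holds.
    active: AtomicBool,
    /// Mechanism 2: notifier fired when the last reference is dropped.
    dropped: Mutex<Sender<Role>>,
}

impl WriterTable {
    /// Returns the shared handle plus a receiver that gets one message
    /// once every clone of the handle has been dropped.
    pub fn new(data: Arc<TableData>) -> (Arc<dyn RoleTable>, Receiver<Role>) {
        let (tx, rx) = channel();
        let table: Arc<dyn RoleTable> = Arc::new(WriterTable {
            data,
            active: AtomicBool::new(true),
            dropped: Mutex::new(tx),
        });
        (table, rx)
    }

    fn ensure_active(&self) -> Result<(), String> {
        if self.active.load(Ordering::SeqCst) {
            Ok(())
        } else {
            Err("stale handle: table is no longer a Writer".to_string())
        }
    }
}

impl RoleTable for WriterTable {
    fn role(&self) -> Role {
        Role::Writer
    }

    fn write(&self, row: &str) -> Result<(), String> {
        self.ensure_active()?;
        self.data.rows.lock().unwrap().push(row.to_string());
        Ok(())
    }

    fn read(&self) -> Result<Vec<String>, String> {
        self.ensure_active()?;
        Ok(self.data.rows.lock().unwrap().clone())
    }

    fn revoke(&self) {
        self.active.store(false, Ordering::SeqCst);
    }
}

impl Drop for WriterTable {
    /// Runs exactly once, when the last `Arc` pointing here goes away.
    fn drop(&mut self) {
        if let Ok(tx) = self.dropped.lock() {
            // The receiver may already be gone; ignore that case.
            let _ = tx.send(Role::Writer);
        }
    }
}
```

In this sketch a role change would call `revoke()` on the old handle first, so tasks still holding a clone get an error instead of writing, and then wait on the receiver before installing the new role's handle.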

Tasks

  • [x] Implement WAL Replicator (#179)
  • [x] TableImpl fetch table from Instance dynamically (#202)
  • [ ] drop_table and close_table should require a TableRef
  • [ ] RoleTable trait's definition
  • [ ] Implement roles

Additional context

waynexia, Aug 08 '22 05:08

> In distributed mode a table might be one of three roles: Writer, Reader and Querier based on the cluster's status.

Those names are modeled after each role's relation to WAL, not its role in a distributed system, where such roles are usually called Master/Slave or Leader/Follower.

And Writer is kind of misleading, since it can not only write but also query, so the naming may need to be reconsidered.

I suggest one possible naming style (with reference to Kafka):

  • Writer -> LeaderTable, like leader replica
  • Reader -> InSyncTable, like in-sync replicas
  • Querier -> NoSyncTable, like out-of-sync replica

jiacai2050, Aug 09 '22 09:08

> And Writer is kind of misleading, since it can not only write but also query, so the naming may need to be reconsidered.

Makes sense. Leader/InSync/NoSync looks more focused on the replicating roles 👍

I haven't implemented this and we can discuss it with more sparks ✨

waynexia, Aug 10 '22 02:08

Maybe expose table role in system.public.tables.

chunshao90, Aug 12 '22 08:08

> Maybe expose table role in system.public.tables.

👍 sounds good

waynexia, Aug 12 '22 09:08

The necessity of table roles needs more discussion in the future, and the follower role may not be introduced at all because of its complexity.

ShiKaiWi, Feb 09 '23 11:02