RADAR-Backend

How to scale up partitions avoiding inconsistency

Open fnobilia opened this issue 7 years ago • 3 comments

How to scale up partitions

The current streams implementation is sensitive to changes in the number of partitions. It assumes that all messages for a given patient always go through the same partition.

Kafka spreads messages across partitions by computing hash(key) % partition_number. If we change partition_number at runtime (say at time t), messages that previously went to partition x may go to partition y from time t+1 onwards. These two partitions may be assigned to two different streams. This means there will be a transient period during which two streams concurrently update the hot storage view for the given patient. They will manage two time windows that may overlap, producing inconsistent data for a few seconds.
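To make the remapping concrete, here is a minimal Python sketch of the hash(key) % partition_number rule. It is only an illustration: `partition_for`, `moved_keys` and the `patient-N` keys are hypothetical names, and CRC32 is used purely as a deterministic stand-in for Kafka's own key hash, so the actual partition numbers will differ from what a real cluster assigns.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Stand-in for hash(key) % partition_number (CRC32, not Kafka's real hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count: int, new_count: int):
    """Return the keys whose partition changes when the partition count changes."""
    return [
        key
        for key in keys
        if partition_for(key, old_count) != partition_for(key, new_count)
    ]


# Hypothetical patient identifiers used as message keys.
patients = [f"patient-{i}" for i in range(1000)]
moved = moved_keys(patients, old_count=3, new_count=4)
print(f"{len(moved)} of {len(patients)} keys change partition when going from 3 to 4")
```

Every key in `moved` is a patient whose messages would switch partitions at time t, and could therefore be picked up by a different stream than the one still holding that patient's open time window.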

To scale up/down partitions avoiding this issue, we can set up three layers of topics:

  1. device topic
  2. temp topic
  3. business topic

Level 1 will be scaled up/down according to load. A consumer group will forward messages from level 1 to level 2. Streams will consume level 2. To scale level 2 up/down, we need to:

  • stop the consumer group between levels 1 and 2
  • stop the streams between levels 2 and 3
  • scale level 2 up/down
  • restart the consumer group and the streams from the earliest offset

This change is transparent to the users, since level 1 will always be online to serve their requests.
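The procedure above can be sketched with a small in-memory model. This is not the RADAR-Backend implementation and it does not use the Kafka API: `Topic`, `Pipeline`, `rebuild_temp` and `partitions_per_key` are hypothetical names, CRC32 stands in for Kafka's key hash, and resizing level 2 is modelled (as an assumption) by creating a fresh temp topic and replaying level 1 from its earliest offset.

```python
import zlib
from dataclasses import dataclass, field


def partition_for(key: str, num_partitions: int) -> int:
    """Stand-in for hash(key) % partition_number (CRC32, not Kafka's real hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


@dataclass
class Topic:
    """In-memory stand-in for a topic: one list of (key, value) per partition."""
    name: str
    num_partitions: int
    partitions: list = field(default_factory=list)

    def __post_init__(self):
        self.partitions = [[] for _ in range(self.num_partitions)]

    def send(self, key: str, value) -> None:
        self.partitions[partition_for(key, self.num_partitions)].append((key, value))


class Pipeline:
    """Hypothetical model of the forwarder (level 1 -> 2) and streams (level 2 -> 3)."""

    def __init__(self, device: Topic, temp_partitions: int):
        self.device = device
        self.temp = Topic("temp", temp_partitions)
        self.forwarder_running = True
        self.streams_running = True

    def forward_all(self) -> None:
        """Copy every device message to the temp topic, re-partitioning by key."""
        if not self.forwarder_running:
            raise RuntimeError("forwarder is stopped")
        for partition in self.device.partitions:
            for key, value in partition:
                self.temp.send(key, value)

    def rebuild_temp(self, new_partitions: int) -> None:
        # Steps 1 and 2: stop the forwarder and the streams.
        self.forwarder_running = False
        self.streams_running = False
        # Step 3: resize level 2 (modelled here as a fresh topic).
        self.temp = Topic("temp", new_partitions)
        # Step 4: restart both and replay level 1 from its earliest offset.
        self.forwarder_running = True
        self.streams_running = True
        self.forward_all()


def partitions_per_key(topic: Topic) -> dict:
    """Map each key to the set of partition indexes it appears in."""
    seen = {}
    for index, partition in enumerate(topic.partitions):
        for key, _ in partition:
            seen.setdefault(key, set()).add(index)
    return seen


device = Topic("device", 3)
for i in range(20):
    device.send(f"patient-{i % 5}", i)

pipeline = Pipeline(device, temp_partitions=2)
pipeline.forward_all()
pipeline.rebuild_temp(4)
```

After `rebuild_temp(4)`, `partitions_per_key(pipeline.temp)` maps every patient to exactly one level 2 partition, because nothing is written to level 2 while its partition count changes. Note that Kafka itself only allows the partition count of an existing topic to be increased, so scaling level 2 down would mean recreating the topic, which is what this sketch models in both directions.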

This approach entails a small overhead that should be worth paying given the improvement.

This issue does not involve the Cold Storage since it consumes the device topic.

fnobilia · Jan 23 '17 10:01