system-design-and-architecture
discussions of November
This thread records the discussions that took place in the chat group in November.
Delivery Guarantees
How to achieve exactly-once delivery?
Storm: Edge, Path, XOR
Imagine a data item flowing through a DAG. How do we ensure that each data item goes through each vertex only once?
Along the path, each edge has one starting vertex and one ending vertex, like A -> C in the following graph.
For each edge and its two vertices, a data item D1 generates one random ID, 0010. As the item travels along the edge, the same ID is emitted twice, once at each of the two vertices (0010 at a and at c).
Similarly, after c, a derived data item D2 has 1011 at c and e, and D3 has 1101 at c and d.
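Finally, for a given data item and all items derived from it, the XOR of all random IDs emitted at those vertices in the DAG should equal 0. Each ID is emitted exactly twice and x XOR x = 0, so the IDs cancel in pairs once every edge has been fully traversed.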
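The XOR bookkeeping above can be sketched in a few lines of Python, using the IDs from the example (0010, 1011, 1101). This is only an illustration, not Storm's actual code: the `AckTracker` class and its `report` method are hypothetical names. The sketch also assumes that c reports the incoming ID and the IDs of its two derived items in a single update, so the checksum does not drop to 0 while D2 and D3 are still in flight.

```python
class AckTracker:
    """Hypothetical tracker: XORs every edge ID reported for one root data item."""

    def __init__(self) -> None:
        self.checksum = 0

    def report(self, *edge_ids: int) -> int:
        # Fold all IDs of one report into the checksum in a single update.
        for edge_id in edge_ids:
            self.checksum ^= edge_id
        return self.checksum

    def fully_processed(self) -> bool:
        # Every ID has been reported twice, so all pairs cancelled out.
        return self.checksum == 0


D1, D2, D3 = 0b0010, 0b1011, 0b1101  # IDs for edges a->c, c->e, c->d

tracker = AckTracker()
history = [
    tracker.report(D1),          # a emits D1 towards c
    tracker.report(D1, D2, D3),  # c finishes D1 and emits D2, D3
    tracker.report(D2),          # e finishes D2
    tracker.report(D3),          # d finishes D3
]

print([format(c, "04b") for c in history])
print(tracker.fully_processed())
```

A checksum that stays nonzero means at least one ID has been reported only once, i.e. some item is still in flight or was lost, which is the signal to replay the root item.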
MillWheel: starting vertex retry, ending vertex dup-check & ack
In a DAG, imagine an edge a -> b: a keeps retrying until it receives an ack from b. b filters out duplicates among the messages it receives, processes the rest, and finally returns an ack.
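A minimal sketch of this retry / dup-check / ack loop, for illustration only. `Receiver`, `FlakyChannel`, and `send_with_retry` are hypothetical names, not MillWheel APIs. The sketch assumes each message carries a unique ID, and that b also acks duplicates so that a can stop retrying.

```python
class Receiver:
    """Vertex b: processes each message ID at most once, always acks."""

    def __init__(self) -> None:
        self.seen: set[str] = set()
        self.processed: list[str] = []

    def deliver(self, msg_id: str, payload: str) -> bool:
        if msg_id not in self.seen:  # dup-check
            self.seen.add(msg_id)
            self.processed.append(payload)  # "process" the message
        return True  # ack, also for duplicates


class FlakyChannel:
    """Delivers every message but loses the first `drop_acks` acks."""

    def __init__(self, receiver: Receiver, drop_acks: int) -> None:
        self.receiver = receiver
        self.drop_acks = drop_acks

    def send(self, msg_id: str, payload: str) -> bool:
        acked = self.receiver.deliver(msg_id, payload)
        if self.drop_acks > 0:
            self.drop_acks -= 1
            return False  # the ack was lost on the way back
        return acked


def send_with_retry(channel: FlakyChannel, msg_id: str, payload: str,
                    max_attempts: int = 5) -> int:
    """Vertex a: resend until an ack arrives; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        if channel.send(msg_id, payload):
            return attempt
    raise RuntimeError(f"no ack for {msg_id} after {max_attempts} attempts")


b = Receiver()
attempts = send_with_retry(FlakyChannel(b, drop_acks=2), "m1", "hello")
print(attempts, b.processed)
```

Retries alone give at-least-once delivery; the dup-check on the receiving side is what turns the repeated deliveries into a single processing of the message.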
In GFS, how to ensure the serial order of concurrent writes?
https://users.cs.duke.edu/~chase/cps510/slides/gfs-etc.pdf
- The client asks the master for a list of replicas, and which replica holds the lease to act as primary.
- The primary assigns a serial number to each write request, which fixes a single order even when multiple clients send requests at the same time.
- For replication, the primary forwards each write request to the secondaries with the same serial number, so every replica applies the writes in the same order.
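The ordering idea in the steps above can be sketched as follows. This is a simplified, single-process illustration, not GFS code: the `Replica` and `Primary` classes are hypothetical, the master and lease handling are left out, and buffering out-of-order requests on a replica is an assumption of the sketch.

```python
import itertools


class Replica:
    """Applies writes strictly in serial-number order."""

    def __init__(self) -> None:
        self.log: list[str] = []
        self.pending: dict[int, str] = {}
        self.next_serial = 1

    def apply(self, serial: int, data: str) -> None:
        # Buffer the write, then apply every write that is now in order.
        self.pending[serial] = data
        while self.next_serial in self.pending:
            self.log.append(self.pending.pop(self.next_serial))
            self.next_serial += 1


class Primary(Replica):
    """The lease holder: assigns serial numbers and forwards the writes."""

    def __init__(self, secondaries: list[Replica]) -> None:
        super().__init__()
        self.secondaries = secondaries
        self._serials = itertools.count(1)

    def write(self, data: str) -> int:
        serial = next(self._serials)  # one global order for all clients
        self.apply(serial, data)
        for secondary in self.secondaries:
            secondary.apply(serial, data)  # same serial number everywhere
        return serial


secondaries = [Replica(), Replica()]
primary = Primary(secondaries)
primary.write("write from client 1")
primary.write("write from client 2")
print(all(s.log == primary.log for s in secondaries))

# A replica that receives forwarded requests out of order still ends up
# with the same log, because it applies them by serial number.
late = Replica()
late.apply(2, "write from client 2")
late.apply(1, "write from client 1")
print(late.log == primary.log)
```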
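Gary
A question for the tech experts here: does anyone know how to import data from an Oracle database (12c) into Elasticsearch 7? The historical data is stored in Oracle, and we plan to import it into Elasticsearch for search and analytics. With the 2.x versions we could use elasticsearch-jdbc to import the Oracle data, but the new version no longer supports it.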