
feat(cluster): initial commit for scale-out cluster

Open rchincha opened this issue 1 year ago • 2 comments

What type of PR is this?

Which issue does this PR fix:

What does this PR do / Why do we need it:

If an issue # is not available please add repro steps and logs showing the issue:

Testing done on this change:

Automation added to e2e:

Will this break upgrades or downgrades?

Does this PR introduce any user-facing change?:


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

rchincha, Nov 13 '23 19:11

Proposal:

  1. A single node has an upper limit on CPU, memory, and storage, so a deployment eventually has to grow beyond a single node.
  2. Use static routing based on the repository "name".
  3. Each node either handles the request locally or must route it to another node.
  4. Should routing use HTTP redirects (3xx) or a proxy?
  5. Choose the proxy option from 4., since it is not clear whether clients handle 3xx responses. The trade-off is an extra proxy/networking cost.
  6. The upside is that we can either expose all nodes via DNS, or hide them behind a stateless ingress that sprays requests across the nodes.
  7. The cluster size can grow or shrink, which means a new lookup may not find the data locally.
  8. However, since lookups are content-based, we can add the other nodes as an implicit "sync" rule for a particular "name". If the data is not found locally, "sync" fetches it from other members and, along with "retention", the cluster rebalances over time.
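
A minimal sketch of points 2 and 3, not taken from the PR itself: it assumes a static, identically ordered member list on every node and uses an FNV-1a hash modulo the node count to pick the owner of a repository name. The names `ownerOf` and `route` and the `zot-N:5000` addresses are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ownerOf maps a repository name to one member of a static node list.
// Assumption: nodes is non-empty and ordered the same way on every member.
func ownerOf(repo string, nodes []string) string {
	h := fnv.New32a()
	h.Write([]byte(repo)) // hash.Hash writes never return an error
	return nodes[h.Sum32()%uint32(len(nodes))]
}

// route reports which node owns the repository and whether that node is
// the one that received the request (local) or the request must be proxied.
func route(self, repo string, nodes []string) (target string, local bool) {
	owner := ownerOf(repo, nodes)
	return owner, owner == self
}

func main() {
	nodes := []string{"zot-0:5000", "zot-1:5000", "zot-2:5000"}
	for _, repo := range []string{"alpine", "library/busybox"} {
		target, local := route("zot-0:5000", repo, nodes)
		fmt.Printf("%s -> %s (local=%v)\n", repo, target, local)
	}
}
```

With a plain modulo, changing the number of nodes remaps most names to a different owner, which is the situation points 7 and 8 address with the implicit "sync" rule.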

Challenges:

  1. Listing "all" repositories
  2. Cross-repository queries
  3. GraphQL therefore has to account for this split state/view.

rchincha, Nov 13 '23 21:11

previously i have worked with seaweedfs, which allows for size-adjustable clusters with discovery similar to how you describe. they use raft for this: https://github.com/seaweedfs/seaweedfs/blob/master/weed/server/raft_server.go

possibly a similar approach can be taken. they also are a sharded hash-addressable blob store.

that said, what is the rationale to implement your own cluster system? i feel like this is something better solved separately, since as a user i would probably feel more comfortable using zot on top of redis/ceph instead of making zot do the replication/sharding.

elee1766, Nov 17 '23 00:11