Raymond Cheng
To be clear, this issue is meant to track dynamic resharding and failure recovery on the server side. None of this should be visible to the client.
We can start by running some profiling studies on application workloads.
Currently, we assume 1 instance => 1 process => 1 GPU. High-level question: what is the best way to run multiple Talek instances (e.g. to support multiple data sizes)?
Right now it's set up for chaining, e.g. leader -> follower1 -> follower2 -> etc. Any server can choose to terminate the chain. Is that not sufficient?
Sure. Conceptually, it's essentially a frontend gateway (one that can coexist with any one of the servers), so leader = follower + frontendGateway.
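To make the chaining and gateway idea concrete, here is a rough Python sketch. It is not Talek's actual code: the `Follower` and `FrontendGateway` classes, their methods, and the request format are all hypothetical names made up for illustration. It only shows the shape of the idea: each server forwards to the next one unless it terminates the chain, and the "leader" is just a follower with a gateway in front of it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Follower:
    """Hypothetical server in the chain; `next` is None if it terminates the chain."""
    name: str
    next: Optional["Follower"] = None

    def handle(self, request: str, path: List[str]) -> List[str]:
        # Record that this server processed the request, then forward it
        # down the chain unless this server is the last link.
        path.append(self.name)
        if self.next is None:
            return path
        return self.next.handle(request, path)


@dataclass
class FrontendGateway:
    """Hypothetical client-facing entry point that coexists with one server."""
    local: Follower

    def submit(self, request: str) -> List[str]:
        # The gateway hands the request to its co-located follower;
        # the client never sees the rest of the chain.
        return self.local.handle(request, [])


# leader = follower + frontendGateway
follower2 = Follower("follower2")
follower1 = Follower("follower1", next=follower2)
leader = FrontendGateway(local=Follower("leader", next=follower1))

print(leader.submit("write"))
```

In this sketch, setting `follower1.next = None` makes follower1 terminate the chain, and nothing changes from the client's point of view because it only ever talks to the gateway.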