go-crate

Add support for multiple back-ends?

Open jeteon opened this issue 9 years ago • 2 comments

I've been working on a new project using Crate. We were attracted to Crate primarily for the distributed features it offers with relatively little hassle. However, this driver is currently set up to communicate with a single instance, and if that instance is down, there doesn't appear to be any way to fall back to the others in the cluster.

Would you be open to having the ability to handle multiple instances added? I'm willing to put in some of the work myself.

This could work either by having a number of URLs specified in the DSN or by querying the sys.nodes table once connected to a node. Preferably it would do both, so that the user need not specify every node up front, but the application is also not stranded should the particular node it's set up to communicate with be down.
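To make the first option concrete, here is a minimal sketch of what a multi-node DSN could look like. The comma-separated format and the `ParseNodes` name are assumptions for illustration only; nothing here is existing driver API.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// ParseNodes splits a comma-separated DSN into validated node URLs.
// The comma-separated format is hypothetical, not something the
// driver is known to accept today.
func ParseNodes(dsn string) ([]string, error) {
	var nodes []string
	for _, part := range strings.Split(dsn, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue // tolerate stray commas
		}
		u, err := url.Parse(part)
		if err != nil {
			return nil, fmt.Errorf("invalid node URL %q: %w", part, err)
		}
		if u.Scheme == "" || u.Host == "" {
			return nil, fmt.Errorf("node URL %q needs a scheme and host", part)
		}
		// Normalise so "http://host:4200/" and "http://host:4200" match.
		nodes = append(nodes, strings.TrimRight(u.String(), "/"))
	}
	if len(nodes) == 0 {
		return nil, errors.New("dsn contains no node URLs")
	}
	return nodes, nil
}

func main() {
	// Example addresses are placeholders.
	nodes, err := ParseNodes("http://10.0.0.1:4200/, http://10.0.0.2:4200")
	if err != nil {
		panic(err)
	}
	fmt.Println(nodes)
}
```

The list parsed this way would only be the seed; nodes discovered later from sys.nodes could be appended to it.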

P.S.: this is of course a prelude to other functionality such as distributing queries among instances.

jeteon, Dec 24 '15 09:12

Yes, this is a feature we must support. I agree with receiving the cluster via the DSN, although I see more utility in the "automatic cluster" feature (querying sys.nodes).

One of the suggested approaches to load balancing in Crate is to set up a read-only instance in the cluster and query that instead, since it will be better suited to route the request to the correct node.

You could deploy this read-only instance alongside the application code, so availability wouldn't be a concern, but I can see this not being possible in some environments.

I'm not 100% sure what the best approach would be if we decide to manage the cluster state, and handle availability and distribution ourselves.

herenow, Dec 24 '15 13:12

I think you might need both the DSN and sys.nodes. Perhaps the DSN could specify just two of the cluster members (supplying only one risks supplying a member that is out of service). Even if you deploy the read-only node on the same machine, you would not be able to start the application server during the ~1 minute that it takes to get a Crate instance up. Though I see your point.

I think as a start, simple fail-over support is the easiest thing to do. Basically, allow supplying a number of server URLs and then use the next one in a cyclic list when the current one isn't available. We wouldn't necessarily have to do fancy things like load-balancing as step one.
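Roughly what I have in mind, as a sketch only. The `Failover` type, `Do`, and `errDown` are names made up for illustration, and the callback stands in for whatever request the driver actually makes:

```go
package main

import (
	"errors"
	"fmt"
)

// errDown is a stand-in for a connection failure (hypothetical).
var errDown = errors.New("node down")

// Failover walks a cyclic list of node URLs and remembers which
// node answered last. It is not safe for concurrent use.
type Failover struct {
	nodes   []string
	current int
}

func NewFailover(nodes []string) *Failover {
	return &Failover{nodes: nodes}
}

// Do calls fn with the current node. On error it moves to the next
// node in the cycle, giving up once every node has been tried once.
func (f *Failover) Do(fn func(node string) error) error {
	if len(f.nodes) == 0 {
		return errors.New("no nodes configured")
	}
	var lastErr error
	for i := 0; i < len(f.nodes); i++ {
		if lastErr = fn(f.nodes[f.current]); lastErr == nil {
			return nil
		}
		f.current = (f.current + 1) % len(f.nodes)
	}
	return fmt.Errorf("all %d nodes failed, last error: %w", len(f.nodes), lastErr)
}

func main() {
	// Placeholder addresses; pretend the first node is unreachable.
	f := NewFailover([]string{"http://10.0.0.1:4200", "http://10.0.0.2:4200"})
	err := f.Do(func(node string) error {
		if node == "http://10.0.0.1:4200" {
			return errDown
		}
		fmt.Println("query served by", node)
		return nil
	})
	if err != nil {
		panic(err)
	}
}
```

This treats every error as "node unavailable"; a real implementation would want to fail over only on connection errors and return query errors straight to the caller.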

jeteon, Dec 27 '15 00:12