cdrs

Create connection from a string

Open Keats opened this issue 8 years ago • 9 comments

As mentioned on Reddit, I'm planning to add support for migrations for Cassandra in https://github.com/Keats/dbmigrate

Would it be possible to add a method to cdrs to get a connection from a string like cassandra://username:password@host:port/keyspace?

Keats avatar Feb 06 '17 07:02 Keats

@Keats It seems to make sense to provide this as a method for CDRS in non-SSL mode. Non-SSL because, apart from the address and credentials, an SSL connection also needs a certificate. Also, it should obviously use the password authenticator.

So, it might look like:

impl CDRS {
    #[cfg(not(feature="ssl"))]
    fn from_string(connection_string: &str)
}

USAGE:

CDRS::from_string("cassandra://username:password@host:port/keyspace")

There is one thing here I don't understand at the moment: how this keyspace will work. On one hand, to apply this keyspace to each query by default we'll need to keep it somewhere inside CDRS. On the other hand, it can be set from:

  • this connection string;
  • a set keyspace query;
  • the query itself: SELECT * from keyspace.table

To support this feature (keyspace) we may want some smart merge strategy for these sources.
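One possible merge strategy, sketched here purely as an assumption (cdrs does not define this, and the function name is hypothetical): let the most specific source win. A keyspace written in the query itself takes precedence, then the one set by a set keyspace (USE) query, and the connection string acts only as the default.

```rust
/// Hypothetical resolver, not part of cdrs: picks the keyspace that applies
/// to a query when it may come from three different places.
///
/// Assumed precedence, most specific first:
///   1. a keyspace qualified in the query itself (keyspace.table)
///   2. a keyspace set earlier by a set keyspace (USE) query
///   3. the keyspace given in the connection string
pub fn effective_keyspace<'a>(
    from_query: Option<&'a str>,
    from_use: Option<&'a str>,
    from_connection_string: Option<&'a str>,
) -> Option<&'a str> {
    from_query.or(from_use).or(from_connection_string)
}
```

With this ordering, a connection-string keyspace never overrides anything the user set later; it only fills in when nothing else is given.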

Until that is clear, we could provide a from_string that accepts a connection string without a keyspace: cassandra://username:password@host:port
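For illustration, parsing such a string could look roughly like the sketch below. It uses only the standard library; the ConnectionParams type and the parse_connection_string function are hypothetical names, not cdrs API. It assumes a single host, treats the keyspace as optional, and does no percent-decoding or IPv6 handling.

```rust
/// Parts of a cassandra:// connection string.
/// Hypothetical type for illustration, not part of cdrs.
#[derive(Debug, PartialEq)]
pub struct ConnectionParams {
    pub username: String,
    pub password: String,
    pub host: String,
    pub port: u16,
    pub keyspace: Option<String>,
}

/// Parses cassandra://username:password@host:port[/keyspace].
/// Assumptions: one host, no percent-decoding, no IPv6 literals.
pub fn parse_connection_string(s: &str) -> Result<ConnectionParams, String> {
    let rest = s
        .strip_prefix("cassandra://")
        .ok_or_else(|| "expected the cassandra:// scheme".to_string())?;

    // Split off the optional "/keyspace" suffix; an empty one counts as absent.
    let (authority, keyspace) = match rest.split_once('/') {
        Some((a, k)) if !k.is_empty() => (a, Some(k.to_string())),
        Some((a, _)) => (a, None),
        None => (rest, None),
    };

    // Split at the last '@' so the password itself may contain '@'.
    let (creds, addr) = authority
        .rsplit_once('@')
        .ok_or_else(|| "missing username:password@".to_string())?;
    let (username, password) = creds
        .split_once(':')
        .ok_or_else(|| "missing ':' between username and password".to_string())?;
    let (host, port) = addr
        .rsplit_once(':')
        .ok_or_else(|| "missing ':' between host and port".to_string())?;
    if host.is_empty() {
        return Err("empty host".to_string());
    }
    let port = port
        .parse::<u16>()
        .map_err(|e| format!("invalid port {:?}: {}", port, e))?;

    Ok(ConnectionParams {
        username: username.to_string(),
        password: password.to_string(),
        host: host.to_string(),
        port,
        keyspace,
    })
}
```

Calling parse_connection_string("cassandra://username:password@host:9042") yields a value whose keyspace field is None, which matches the keyspace-less form suggested above.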

AlexPikalov avatar Feb 06 '17 08:02 AlexPikalov

It would also be cool if one could select a keyspace in advance so it would not need to be repeated in future queries.

e.g.: select * from keyspace_remembered.table ..

ernestas-poskus avatar Feb 06 '17 11:02 ernestas-poskus

@ernestas-poskus It's already possible via the set keyspace (USE) query.

https://github.com/AlexPikalov/cdrs#use-query

http://docs.datastax.com/en/cql/3.1/cql/cql_reference/use_r.html

AlexPikalov avatar Feb 06 '17 12:02 AlexPikalov

Any update on that? I don't know much about keyspaces, as I've never used Cassandra myself; they seem similar to a schema in SQL databases? I was just looking at https://github.com/mattes/migrate/tree/master/driver/cassandra as a reference; maybe the Golang driver can be used to see how they implement it?

Keats avatar Feb 21 '17 12:02 Keats

@Keats: cassandra://username:password@host:port assumes that we have only one host; Cassandra in production would have an array of hosts. It is a ring topology. At any point in time a node can die or a node can be added to the cluster. The client would (not supported by cdrs currently) poll the cluster periodically to see which nodes are up or down and add or remove node IPs from its memory.

Say we have a 4-node cluster (10.10.10.10, 10.10.10.11, 10.10.10.12, 10.10.10.13) with a replication factor of 2, so the data on 10.10.10.10 would be copied to 10.10.10.11, and similarly 10.10.10.12 and 10.10.10.13 are a pair. If we have provided cassandra://cdrs:[email protected]:9042 as the initial Cassandra connection string, there is no real guarantee that 10.10.10.10 will stay alive forever; but since a copy of the data on 10.10.10.10 is on 10.10.10.11, the Cassandra server would serve the data out of 10.10.10.11, and the client application using the driver shouldn't have to worry about this, as the driver would abstract this transition behind the scenes.

I know I have gone off on a totally different tangent with my explanation, but does it make any sense?

harrydevnull avatar Feb 26 '17 15:02 harrydevnull

I see, and I guess you will need the cluster to agree on a schema, so you probably need special handling to wait until that happens before running the next migration (in my case, a CLI to run schema migrations). Thanks for the explanation.

Keats avatar Feb 27 '17 04:02 Keats

Yes, precisely! Providing an array of hosts (I deal with a Cassandra cluster of 50 nodes) in a string seems less ergonomic.

Fun fact

On a side note, every node knows about all other nodes, so mentioning a single node would be enough to execute CQL statements. There, I contradicted myself! Developers who want to run an application in production should not have a single-node configuration, period. If it is a tool which runs once in a while and it fails, we can change the host and retry. I don't know how we could incorporate these two orthogonal features into the same library.
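The "change the host and retry" idea can be sketched as a small helper that walks a list of contact points and returns the first connection that succeeds. This is only an illustration under assumptions: connect_first is a hypothetical name, not cdrs API, and the actual connection step is passed in as a closure so the sketch does not depend on any driver.

```rust
/// Tries each contact point in order and returns the first successful
/// connection. If every attempt fails, returns all collected errors.
/// Hypothetical helper for illustration, not part of cdrs.
pub fn connect_first<T, E>(
    contact_points: &[&str],
    mut connect: impl FnMut(&str) -> Result<T, E>,
) -> Result<T, Vec<E>> {
    let mut errors = Vec::new();
    for &addr in contact_points {
        match connect(addr) {
            // Stop at the first node that accepts the connection.
            Ok(conn) => return Ok(conn),
            // Remember why this node failed and move on to the next one.
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}
```

For a plain TCP check, the closure could be |addr| std::net::TcpStream::connect(addr), called with a slice such as &["10.10.10.10:9042", "10.10.10.11:9042"]. This covers the run-once tool case only; it does not discover or track nodes the way a production driver would.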

harrydevnull avatar Feb 27 '17 16:02 harrydevnull

Doesn't the driver discover the other nodes in the ring from the initial node(s) it connects to?

cmsd2 avatar Apr 15 '20 20:04 cmsd2

@cmsd2 Not yet. So far there is an option to exclude nodes that went down, based on server events.

AlexPikalov avatar Apr 16 '20 16:04 AlexPikalov