Added CompositeCodec to support reading/writing composite columns
This codec enables operations on composite columns, which Cassie does not currently support. Example usage:
val cluster = new Cluster("localhost", 9160)
val keyspace = cluster.keyspace("test").connect()
val compositeTest = keyspace.columnFamily("CompositeTest", Utf8Codec, CompositeCodec, Utf8Codec)
  .consistency(ReadConsistency.One)
  .consistency(WriteConsistency.One)

// Decoding a composite: the decoder supplies the component types on read.
// Defined before first use so that `decoder` is initialized when printCol runs.
val decoder = Decoder(Utf8Codec, LongCodec)
def printCol(c: Column[Composite, String]) = {
  val colName = decoder.decode(c.name)
  val c1 = colName._1.value
  val c2 = colName._2.value
  println(c1 + ":" + c2 + " = " + c.value)
}

// Write one column whose name is the composite ("c1", 2)
val composite = Composite(Component("c1", Utf8Codec), Component(2L, LongCodec))
compositeTest.insert("testkey", Column(composite, "testval2"))()

println("row ==")
compositeTest.getRow("testkey")().foreach(c => printCol(c._2))

// Slice every column in the same row whose first component is "c1"
println("one col ==")
compositeTest.getRowSlice("testkey",
  Some(Composite(Component("c1", Utf8Codec, ComponentEquality.EQ))),
  Some(Composite(Component("c1", Utf8Codec, ComponentEquality.GTE))),
  Int.MaxValue)()
  .foreach(printCol)
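For anyone wondering why the slice uses EQ for the start and GTE for the end: here is a rough sketch of what I believe the two bounds look like on the wire. It assumes Cassandra's CompositeType layout (2-byte length, component bytes, 1-byte end-of-component marker) and that ComponentEquality.EQ and ComponentEquality.GTE map to marker values 0 and 1, as they do in Hector. The object and method names are made up for illustration and are not part of this patch.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object SliceBoundsSketch {
  // Assumed end-of-component marker values (0 = equal, 1 = greater-than-or-equal).
  val EqMarker: Byte = 0
  val GteMarker: Byte = 1

  // Encode a single-component bound as <2-byte length><bytes><end-of-component>.
  def bound(component: String, eoc: Byte): Array[Byte] = {
    val bytes = component.getBytes(UTF_8)
    val out = ByteBuffer.allocate(2 + bytes.length + 1)
    out.putShort(bytes.length.toShort)
    out.put(bytes)
    out.put(eoc)
    out.array()
  }

  // Sorts before every stored column whose first component is "c1".
  val startBound: Array[Byte] = bound("c1", EqMarker)
  // Sorts after every stored column whose first component is "c1".
  val endBound: Array[Byte] = bound("c1", GteMarker)
}
```

The two bounds differ only in the last byte, which is what lets a one-component prefix select columns that carry additional components.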
In addition to the above detail-oriented comments, I have a couple of high-level comments:
- This only supports composite columns where all the components are the same type. That seems very limited.
- We should move all the encoding logic into the codec, if that's possible.
No problem on the detail changes, and yes private[this] is technically what I meant, and I'm happy to make that change. In practice I find it makes little real difference, and it makes the code look more cluttered. But I get your rationale. Good catch on the var/mutable.
Regarding heterogeneous types, this is supported if you use the ByteArrayCodec, in which case you're responsible for whatever encoding you want to do prior to constructing the Composite. We follow this pattern in some cases. I originally started down the path of passing in a list of codecs, then handling the encoding inside CompositeCodec, but I didn't do this because it really complicates the typing. Look at Hector's implementation if you want to see what I mean. I do agree that it would make the client code cleaner.
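To make the ByteArrayCodec route concrete, this is roughly the kind of client-side encoding I mean. It is only a sketch: the helper names are hypothetical, nothing here is Cassie API, and how the resulting buffers are handed to Component is left out because it depends on the final signature.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object ManualEncoding {
  // Hypothetical helpers: the caller turns each value into raw bytes up front.
  def encodeLong(v: Long): ByteBuffer = {
    val buf = ByteBuffer.allocate(8)
    buf.putLong(v)
    buf.flip() // make the eight written bytes readable
    buf
  }

  def encodeUtf8(s: String): ByteBuffer = ByteBuffer.wrap(s.getBytes(UTF_8))

  // The caller also has to remember which decoder belongs to which position.
  // duplicate() keeps the original buffer's position untouched.
  def decodeLong(b: ByteBuffer): Long = b.duplicate().getLong()

  def decodeUtf8(b: ByteBuffer): String = UTF_8.decode(b.duplicate()).toString
}
```

Each component is then just bytes as far as the composite is concerned, so mixed types work, at the cost of pushing all of the type bookkeeping onto the client.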
This latest revision moves the encoding scheme into the component rather than being at the codec level. The issue with composites is the complex typing involved. Specifically, you may write a composite of type Long:UUID:UTF8, then query only the Long portion (but using multiple components of type Long to generate your range predicate), then read the entire thing. The current codec mechanism doesn't provide this much flexibility, and type erasure prevents us from doing much in the way of runtime magic.
So after much wrangling, this is my solution:
- Let the component handle its own encoding, rather than trying to pass in the encoding scheme(s) to the CompositeCodec. The side effect here is the codec has no idea what the encoding is when you read it back out of Cassandra. Which leads to...
- When the composite is decoded, it doesn't try to apply the correct encoding because it has no idea what you want. This is in fact the hardest problem to solve gracefully, and has led to things like Hector's dynamic composites. Rather than employ some non-standard type encoding mechanism, I elected to use an external decoder that can translate a generic Composite (comprising only ByteBuffers) to a type-specific DecodedComposite. The advantage over the previous solution is that you can decode the composite as a whole, rather than having to decode every component individually. This is an improvement (IMHO) over Hector's standard composites, where you have to give it a list of serializers on every call.
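The split between an untyped composite and an external decoder can be sketched as below. This is not the code in the patch: RawComposite, Decoder2, and the helper functions are hypothetical names, the sketch returns a plain tuple rather than typed components, and the byte layout assumes Cassandra's CompositeType format (2-byte length, bytes, 1-byte end-of-component).

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object CompositeSketch {
  // What the codec hands back: component bytes only, no type information.
  final case class RawComposite(components: Vector[ByteBuffer])

  // Pack components as <2-byte length><bytes><end-of-component = 0>.
  def pack(components: Seq[ByteBuffer]): ByteBuffer = {
    val out = ByteBuffer.allocate(components.map(c => 2 + c.remaining() + 1).sum)
    components.foreach { c =>
      out.putShort(c.remaining().toShort)
      out.put(c.duplicate())
      out.put(0.toByte)
    }
    out.flip()
    out
  }

  // Split the packed bytes back into untyped components.
  def unpack(bytes: ByteBuffer): RawComposite = {
    val in = bytes.duplicate()
    val parts = Vector.newBuilder[ByteBuffer]
    while (in.hasRemaining) {
      val len = in.getShort() & 0xFFFF
      val part = in.slice()
      part.limit(len)
      parts += part
      in.position(in.position() + len)
      in.get() // skip the end-of-component byte
    }
    RawComposite(parts.result())
  }

  type Decode[A] = ByteBuffer => A
  val asLong: Decode[Long] = b => b.duplicate().getLong()
  val asUtf8: Decode[String] = b => UTF_8.decode(b.duplicate()).toString

  // External decoder: it alone knows the component types, and it decodes
  // the whole composite in one call.
  final case class Decoder2[A, B](first: Decode[A], second: Decode[B]) {
    def decode(raw: RawComposite): (A, B) = {
      require(raw.components.size == 2, "expected exactly two components")
      (first(raw.components(0)), second(raw.components(1)))
    }
  }

  // Round trip: write ("c1", 2), read it back untyped, then decode it.
  def demo(): (String, Long) = {
    val longBytes = ByteBuffer.allocate(8)
    longBytes.putLong(2L)
    longBytes.flip()
    val packed = pack(Seq(ByteBuffer.wrap("c1".getBytes(UTF_8)), longBytes))
    Decoder2(asUtf8, asLong).decode(unpack(packed))
  }
}
```

The point of the design is that unpack never needs to know the types, so the same codec can read any composite, and the caller picks a decoder that matches what was written.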
I look forward to your feedback...
I should note that I have not successfully built Cassie, and this has been tested as a bolt-on component in our own codebase. It should compile correctly, but I can't test to make sure.
I have created a standalone CompositeCodec here: https://github.com/TheWeatherChannel/cassie-composite.
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
Robbie Strickland does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.