kafka messagepack serialisation support
Hello! First of all, this is a great project, and we currently use it as part of our production flow.
We are looking at introducing Kafka as middleware to store the metrics.
One option is to use go-carbon to read messages from Kafka. I would like to write metrics to Kafka using carbon-relay-ng (https://github.com/graphite-ng/carbon-relay-ng), but it only supports the MessagePack serialisation format.
As far as I know, the formats currently implemented for the Kafka input are protobuf, pickle, and plain. Do you have any plans to implement MessagePack as well?
Maybe you know of any projects that provide a good way of writing metrics to Kafka in protobuf/pickle?
Do you have any ideas/implementations of how to use Kafka in a metrics infrastructure stack?
Thanks in advance. Regards, Vlad
I'd say it's not known whether anybody uses the Kafka-based input at all.
As far as I remember, I initially contributed it because I wanted to test Kafka-based delivery, but a large-scale test never happened (and since I've switched to another company, I no longer have any business need for those tests, only tests out of curiosity if I have spare time).
Adding support for the msgpack serialization format should be rather straightforward, and I'd say PRs for that are more than welcome. You can use the existing pickle or protobuf serialization implementation as a base.
Alternatively, you can point to the protocol documentation and hope that either I, lomik, or other people will have time to implement it for you.
Thank you for your answer. I would at least like to do some kind of PoC with the Kafka options, since it looks like a good way to get a reliable AWS inter-region flow of metrics.
For now I don't have a strong enough Golang background to contribute (Python and Ruby are my closer friends), but I can share some context I have:
- Here is the msgpack specification: https://github.com/msgpack/msgpack/blob/master/spec.md
- This is how it is implemented in Golang in another project: https://github.com/graphite-ng/carbon-relay-ng/blob/master/route/kafkamdm.go; it uses this library to implement msgpack: https://github.com/raintank/schema
- Another implementation of msgpack: https://github.com/vmihailenco/msgpack

As far as I understand, one more parser needs to be added here: https://github.com/lomik/go-carbon/tree/master/receiver/parse
On my side, I can help with testing against a real flow of production metrics (I can send it to Kafka using carbon-relay-ng and can deploy a test Kafka cluster for this).
Regards, Vlad
I'm actually more interested in how it's implemented in other products, ideally graphite-web, as it might not be possible to reuse raintank's schema directly: there is no license file attached to the repo. I opened an issue about that approx. a year ago (https://github.com/raintank/schema/issues/17), but there has been no response so far. Technically that means the code is not free and thus cannot be used in any open-source project without the author's permission.
To implement this, the structure of the message would actually be enough (what fields, what order, what field names).
I understand that this was likely not their intention and that the license should probably be something similar to Apache or MIT, but I can't know that for sure.
I faced the same issue and solved it by using the https://github.com/vmihailenco/msgpack library.
After that, adding support for msgpack was super simple:
```go
package parse

import (
	"github.com/go-graphite/go-carbon/points"
	"github.com/vmihailenco/msgpack/v5"
)

// Datapoint is the shape of one msgpack-encoded metric.
type Datapoint struct {
	Name  string  `json:"Name"`
	Value float64 `json:"Value"`
	Time  int64   `json:"Time"`
}

// Msgpack unpacks metrics produced by carbon-relay-ng.
func Msgpack(body []byte) ([]*points.Points, error) {
	result := []*points.Points{}
	var d Datapoint
	if err := msgpack.Unmarshal(body, &d); err != nil {
		return result, err
	}
	result = append(result, points.OnePoint(d.Name, d.Value, d.Time))
	return result, nil
}
```
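For the reverse direction, here is a stdlib-only sketch of decoding such a map-encoded datapoint by hand, mirroring the `Datapoint` struct above. In practice the msgpack library does this for you; this sketch only handles the fixmap/fixstr/float64/int64 cases it assumes, and the `Name`/`Value`/`Time` keys are an illustration, not a guaranteed wire format:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// Datapoint mirrors the struct used by the parser above.
type Datapoint struct {
	Name  string
	Value float64
	Time  int64
}

// readStr decodes a msgpack fixstr (marker 0xa0-0xbf) and returns the
// string plus the remaining bytes.
func readStr(b []byte) (string, []byte, error) {
	if len(b) == 0 || b[0]&0xe0 != 0xa0 {
		return "", nil, errors.New("expected fixstr")
	}
	n := int(b[0] & 0x1f)
	if len(b) < 1+n {
		return "", nil, errors.New("short buffer")
	}
	return string(b[1 : 1+n]), b[1+n:], nil
}

// decodeDatapoint decodes a fixmap with string keys Name (fixstr),
// Value (float 64) and Time (int 64); assumed keys, for illustration.
func decodeDatapoint(b []byte) (Datapoint, error) {
	var d Datapoint
	if len(b) == 0 || b[0]&0xf0 != 0x80 {
		return d, errors.New("expected fixmap")
	}
	pairs := int(b[0] & 0x0f)
	b = b[1:]
	for i := 0; i < pairs; i++ {
		key, rest, err := readStr(b)
		if err != nil {
			return d, err
		}
		b = rest
		switch key {
		case "Name":
			d.Name, b, err = readStr(b)
			if err != nil {
				return d, err
			}
		case "Value":
			if len(b) < 9 || b[0] != 0xcb { // float 64 marker + 8 bytes
				return d, errors.New("expected float64")
			}
			d.Value = math.Float64frombits(binary.BigEndian.Uint64(b[1:9]))
			b = b[9:]
		case "Time":
			if len(b) < 9 || b[0] != 0xd3 { // int 64 marker + 8 bytes
				return d, errors.New("expected int64")
			}
			d.Time = int64(binary.BigEndian.Uint64(b[1:9]))
			b = b[9:]
		default:
			return d, fmt.Errorf("unexpected key %q", key)
		}
	}
	return d, nil
}

func main() {
	// a sample fixmap payload: {"Name": "foo.bar", "Value": 1.5, "Time": 1234567890}
	payload := []byte{
		0x83,
		0xa4, 'N', 'a', 'm', 'e', 0xa7, 'f', 'o', 'o', '.', 'b', 'a', 'r',
		0xa5, 'V', 'a', 'l', 'u', 'e', 0xcb, 0x3f, 0xf8, 0, 0, 0, 0, 0, 0,
		0xa4, 'T', 'i', 'm', 'e', 0xd3, 0, 0, 0, 0, 0x49, 0x96, 0x02, 0xd2,
	}
	d, err := decodeDatapoint(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %v %d\n", d.Name, d.Value, d.Time) // prints: foo.bar 1.5 1234567890
}
```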
Another note on carbon-relay-ng to go-carbon compatibility: even though support for msgpack was added, I only seemed to get a handful of the data. After much reading, I found that carbon-relay-ng partitions the data when producing. go-carbon can only read from a specific Kafka partition id (the default is partition 0).
To avoid updating the go-carbon Kafka receiver code, carbon-relay-ng posts data to a 'graphite' Kafka topic. A Python consumer reads from the 'graphite' topic, gets all the data, and re-posts it to a secondary topic, 'metrics', on partition 0. go-carbon is then able to read from that and report all metrics.