amazon-kinesis-agent
Allow users to specify partition keys
Currently one can only select between random partition keys, or partition keys derived from the data. It should also be possible to specify a partition key associated with each file to be published.
Hello,
Are you saying that you want the partition key of each record coming from the same file to be the same? Like when you configure the Agent, you define a constant string as the partition key for each filePattern?
That is correct.
That sounds reasonable. I've created a task in our backlog. If we don't get around to it, you can also modify the logic here:
https://github.com/awslabs/amazon-kinesis-agent/blob/master/src/com/amazon/kinesis/streaming/agent/tailing/KinesisRecord.java#L60
We would also love to merge a PR for this.
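For anyone picking this up, here is a rough sketch of what the key selection could look like with a constant-key option added. This is not the Agent's actual code: the `PartitionKeyChooser` class, the `Option` enum, and the `CONSTANT` value are hypothetical names made up for illustration, and the random and data-derived branches only approximate the two existing behaviours described above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch, not the Agent's real classes or option names.
class PartitionKeyChooser {
    // CONSTANT is the proposed addition; the other two mirror the
    // "random" and "derived from the data" choices that exist today.
    enum Option { RANDOM, DETERMINISTIC, CONSTANT }

    static String choose(Option option, byte[] data, String constantKey) {
        switch (option) {
            case CONSTANT:
                // Every record from the same flow gets the same key.
                if (constantKey == null || constantKey.isEmpty()) {
                    throw new IllegalArgumentException("constant partition key must be non-empty");
                }
                return constantKey;
            case DETERMINISTIC:
                // Key derived from the record bytes (MD5 hex digest here).
                return md5Hex(data);
            default:
                // RANDOM: spreads records across shards.
                return Long.toString(ThreadLocalRandom.current().nextLong(1_000_000L));
        }
    }

    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] record = "some log line".getBytes(StandardCharsets.UTF_8);
        System.out.println(choose(Option.CONSTANT, record, "app-log"));
        System.out.println(choose(Option.DETERMINISTIC, record, null));
    }
}
```

One thing to keep in mind with a constant key: all records from that file land on the same shard, so throughput for that flow is capped by a single shard's limits.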
@chaochenq It would also be great if you could apply a regex to the data to extract a partition key.
For example, consider the following payload:
{ "deviceId": "1234-5678", "weight": 52 }
To use deviceId as the partition key, you would specify something like the following in the agent.json:
"flows": [ { "filePattern": "/path/to/log/file", "kinesisStream": "data", "partitionKeyPattern": "\"deviceId\"\\s*:\\s*\"([a-zA-Z0-9-]+)\"" } ]
The parentheses would create a matching group that would be the partition key.
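To make the idea concrete, here is a minimal sketch of how the extraction could work using `java.util.regex`. The `PartitionKeyExtractor` class is a hypothetical helper, not something that exists in the Agent, and returning an empty `Optional` when nothing matches is an assumption; the caller would still need to decide on a fallback (for example a random key).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the proposed "partitionKeyPattern" option.
class PartitionKeyExtractor {
    private final Pattern pattern;

    PartitionKeyExtractor(String regex) {
        this.pattern = Pattern.compile(regex);
        // The proposal relies on the first capturing group, so require one.
        if (pattern.matcher("").groupCount() < 1) {
            throw new IllegalArgumentException("pattern must contain a capturing group");
        }
    }

    // Returns the first capturing group of the first match, if any.
    Optional<String> extract(String record) {
        Matcher m = pattern.matcher(record);
        if (m.find()) {
            String key = m.group(1);
            if (key != null && !key.isEmpty()) {
                return Optional.of(key);
            }
        }
        // No match: leave the fallback choice to the caller (assumption).
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Same regex as in the agent.json example, written as a Java string literal.
        String regex = "\"deviceId\"\\s*:\\s*\"([a-zA-Z0-9-]+)\"";
        PartitionKeyExtractor extractor = new PartitionKeyExtractor(regex);

        String record = "{ \"deviceId\": \"1234-5678\", \"weight\": 52 }";
        System.out.println(extractor.extract(record).orElse("<no match>")); // prints 1234-5678
    }
}
```

Compiling the pattern once per flow, rather than once per record, keeps the per-record cost down to a single `find()` call.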
@chaochenq Is anyone working on this? I would like to help add support for this feature.
https://github.com/awslabs/amazon-kinesis-agent/issues/6#issuecomment-221072533
This seems like a good API
Is anyone still working on this? I have a customer that could really use this feature. I see you have a branch that already supports this. Will it ever be merged into the main branch?
@phspies, or anyone else: were you able to successfully use these forked feature branches? I don't think either solution was ever merged back into awslabs. Thank you.
https://github.com/yang-wei/amazon-kinesis-agent/tree/add-partition-key-support
https://github.com/Kryptoncloud/amazon-kinesis-agent/tree/feature/support-partition-key-pattern