Adding stuctured output via new --format flag on most rpk commands.…
Cover letter
Adds json/yaml output to most rpk commands. I'm happy to do the rest but these are the ones I needed to build puppet config management around rpk. I also wanted to get the team's opinion before spending more time as I'm still learning Go and may be doing this incredibly inefficiently. I'm also not sure if dev is the right target for the PR, even if it was accepted. I'm happy for any and all comments/criticisms to make things better.
Backport Required
- [ ] not a bug fix
- [ ] issue does not exist in previous branches
- [x] papercut/not impactful enough to backport
- [ ] v22.2.x
- [ ] v22.1.x
- [ ] v21.11.x
UX changes
Add --format param to most rpk commands to format output as text, json, or yaml, with text being the default, which is the existing table style output that was already how rpk worked.
Release notes
- Added --format flag to most rpk command output that supports text/json/yaml as argument, defaults to text.
Features
- New --format param to most prk commands to support yaml/json output.
Improvements
- Allows monitoring and config management tools to wrap around the rpk command vs having to re-implement direct rest/kafka api clients to accomplish task the rpk cli already fully supports.
Rebased from latest dev to make sure merging is clean, assuming the branch gets accepted.
I think I've got vscode properly configured to auto format on save now. Sorry about messing up simple linter stuff so many times.
The last two failing checks mention "Sign-in with your Redpanda Data Inc. account to access Buildkite" when I click on the details link, so I assume those are not something I can address. If I'm wrong there, please let me know what I need to do.
Do I need to be doing something else to move this PR along?
I'll dive into fixing errors in the morning, and will rebase on the latest dev. For the commenting asking about "omitempty", did you mean all the struct tags, or just message/errors?
I'm actually not 100% positive and might need to bounce this idea off of @r-vasquez -- I think APIs usually optionally include an error, but always include other fields. I'd like to go with what's most common. If GCP / AWS APIs always include all fields, then we should too. If APIs commonly leave out empty fields, then we can leave out empty fields.
Latest push contains most of the changes suggested and a rebase from dev. I did not add the omitempty to all the json/yaml tags everywhere as it seems like that's not fully decided yet. If the team wants them added to all/specific tags, l'll update accordingly once that's decided.
I thought I had vscode setup to auto format on save... but apparently it's not working. When manually running format on the listed files, I saw the expected changes. I apologize for the unformatted commit and will look into vscode's config again.
Edit: the vscode-vim plugin for vs code doesn't fire save hooks on :w apparently. If anyone happens to be using that setup, be aware.
Rebased from latest dev branch.
Rebased from latest dev again.
Rebased again to keep things close to upstream.
Hello @alnet, don't worry about keeping the branch rebased (until you have a merge conflict), our CI rebase and test with the tip of dev so after reviewing + approving the PR I'll rerun everything.
Alrighty
There are mismatch also in rpk group describe vs JSON format:
$ rpk group describe g1
GROUP g1
COORDINATOR 1
STATE Empty
BALANCER
MEMBERS 0
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG MEMBER-ID CLIENT-ID HOST
oct-10 0 - 2 2
test 0 - 18 18
test-seek 0 - 9 9
$ rpk group describe g1 --format json
{"groups":[{"group":"g1","coordinator":{"node_id":1,"port":9092,"host":"0.0.0.0","rack":null},"state":"Empty","protocol_type":"consumer","protocol":"","members":[],"described_rows":{"rows":[{"topic":"oct-10","partition":0,"current_offset":"-","log_end_offset":2,"lag":"2","member_id":"","instance_id":null,"client_id":"","host":"","err":null},{"topic":"test","partition":0,"current_offset":"-","log_end_offset":18,"lag":"18","member_id":"","instance_id":null,"client_id":"","host":"","err":null},{"topic":"test-seek","partition":0,"current_offset":"-","log_end_offset":9,"lag":"9","member_id":"","instance_id":null,"client_id":"","host":"","err":null}],"use_instance_id":false,"use_err":false},"err":null}]}
Comments here:
- Coordinator JSON object includes
id,port,host, andrack, vs the table coordinator property which only includes theid. described_rowsare actually topics, described rows don't mean much to our end users. Same withrow.- Topics (row) include an
instance-idproperty, our table does not include that. - At the end of the JSON object we add
use_instance_idanduse_err property, these 2 properties don't mean anything to our end users and should be stripped from the JSON, those are internal.
I wasn't sure if emulating the table output as closely as possible, or providing more of the raw values coming back from the internal object was the right way to go. My initial thoughts were to provide it all in some meaningful way and let the consumer decide what they care about. This was the logic behind always including the detailed output when --format != text since the consumer could simply ignore that detail if they didn't care.
Is it okay to interpret this comment as "emulate existing table output?" and so go through my additions to only provide the same data as the text tables (detailed output included?)
I'll say "emulate existing command output" instead of "emulate existing table output". That means that we display the same info given the same set of flags in the command, either in text format or JSON format.
"emulate existing command output" is what I meant, wording was poor on my part. Do you want me to proceed with creating a new PR for just one command, and then move on with one PR per command to make the review burden easier? So first one would be adding
--formattorpk topic listas well as the formatted output of a new topic being added, as was given as an example easier that will have the changes only to theoutpackage and the code that uses it in topic list command, once that's polished up and accepted move on to the next, etc? Or would you like it broken down some other way, etc?
It may be a few days before I have time to work on this again so plenty of time to decide on how you'd prefer it sliced up and I'll hopefully get it done in downtime over the upcoming holidays.
We need to be consistent on 2 things:
- The information should be the same in all formats.
- The flags that modify the output, should modify the output in the other formats as well.
A couple of examples:
$ rpk acl list --format text
PRINCIPAL HOST RESOURCE-TYPE RESOURCE-NAME RESOURCE-PATTERN-TYPE OPERATION PERMISSION ERROR
User:my-user * TOPIC demo LITERAL ALL ALLOW
$ rpk acl list --format json
{
"acl":[
{
"principal":"User:my-user",
"host":"*",
"resource_type":"TOPIC",
"resource_name":"demo",
"resource_pattern_type":"LITERAL",
"operation":"ALL",
"permission":"ALLOW",
"error":""
}
]
}
$ rpk topic list --format text
NAME PARTITIONS REPLICAS
asdf 3 1
demo 1 1
$ rpk topic list --format json
{
"topics":[
{
"name":"asdf",
"partitions":3,
"replicas":1
},
{
"name":"demo",
"partitions":3,
"replicas":1
}
]
}
$ rpk topic list --detailed --format text
asdf, 3 partitions, 1 replicas
PARTITION LEADER EPOCH REPLICAS
0 1 39 [1]
1 1 39 [1]
2 1 39 [1]
demo, 1 partitions, 1 replicas
PARTITION LEADER EPOCH REPLICAS
0 1 3 [1]
$ rpk topic list --detailed --format json
{
"topics":[
{
"name":"asdf",
"partitions":3,
"replicas":1,
"partition_detail":[
{
"partition":0,
"leader":1,
"epoch":39,
"replicas":[
1
]
},
{
"partition":1,
"leader":1,
"epoch":39,
"replicas":[
1
]
},
{
"partition":2,
"leader":1,
"epoch":39,
"replicas":[
1
]
}
]
},
{
"name":"demo",
"partitions":3,
"replicas":1,
"partition_detail":[
{
"partition":0,
"leader":1,
"epoch":3,
"replicas":[
1
]
}
]
}
]
}
As the colleague (at the time) of @alnet who suggested and encouraged this solution to the problem of automating around the tooling - to work some structured data into the tool rather than parsing the text output. and to cooperate with you maintainers on getting a PR upstream - I find the progress all of you have made delightful. I'm sure this will help with the problems we were facing, and I feel confident this could help others including perhaps my new team.
I don't have much else to say at this point besides trying to send everyone in the collaborative back-and-forth on this PR some encouraging and thankful words.
It turns out my new team at my new employer would find this handy. Is there any movement on this PR?
Sadly no, our use case hasn't grown as much as I thought it would at work, and my free time at home has been cut into heavily in the past couple months.
We might have time to take ownership of this this quarter -- the biggest thing I think is for us to comprehensively revisit what JSON shapes we want to expose and then wire that through.
Closing this, we are beginning to add --format as well