lakeFS
Extract permission node creation #8201
Related to #8201
Change Description
Automating the documentation for endpoint permissions requires decoupling the permission objects from the endpoint code. This PR extracts the permission objects to a separate file. Since the permission code is sensitive, I opted for a straightforward refactor as a first step. This also makes it possible to test the next changes in the permission code separately from the endpoint code. The functions are named after the logAction operation names when available (e.g. copy_objects -> CopyObjectsPermissions).
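For illustration only, an extracted helper might look roughly like this (the function name and the node it builds are made up for this example, not the exact code from the PR):
// Illustrative sketch only - not the actual extracted code. A node that was
// previously built inline in an endpoint handler moves into a small helper,
// which the handler calls when authorizing the request.
func getObjectPermissions(repository, path string) permissions.Node {
    return permissions.Node{
        Permission: permissions.Permission{
            Action:   permissions.ReadObjectAction,
            Resource: permissions.ObjectArn(repository, path),
        },
    }
}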
Testing Details
When refactoring, I used the IDE's "Extract Method" feature for consistency and to minimize the room for error. I also ran the tests locally (excluding Hadoop-related tests, which I had an issue running locally). Let me know if you think additional coverage is required.
@arielshaqed thanks for reviewing!
My initial thought was doing something similar to the draft here. The TL;DR version is along the lines of mapping each operation name to a permissions object that has the resource template (e.g. "group/{groupID}") in the resource field. The doc-generation part will use this object as-is, and when used in the endpoint flow it will call a "fill" function that substitutes the placeholders with the actual values.
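For the sake of concreteness, a rough sketch of that draft idea (the map, the fill helper, and the delete_group entry are illustrative assumptions, not the actual draft):
// Rough sketch of the template idea; the map, the fill helper and the
// "delete_group" example are assumptions for illustration, not actual lakeFS code.
var permissionTemplates = map[string]permissions.Permission{
    "delete_group": {
        Action:   permissions.DeleteGroupAction,
        Resource: "group/{groupID}",
    },
}

// fill substitutes the {placeholders} in a template resource with the actual
// request values before the node is passed to authorization.
func fill(p permissions.Permission, values map[string]string) permissions.Permission {
    for k, v := range values {
        p.Resource = strings.ReplaceAll(p.Resource, "{"+k+"}", v)
    }
    return p
}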
If I understand your suggestion correctly, the doc-gen code is supposed to call the function that MakeGroupPermissions (etc.) returns, which means it needs to know how many arguments that function expects. It's not impossible to solve, but I think the template solution is easier in this case.
Unrelated question: there's an automatic check that fails this PR for not closing an issue. I'd rather split the code into multiple PRs (I also want to extract the log_action strings to enums before making the map object, which I don't think belongs here). In that case, this PR won't close the original issue. Should I create another issue for this PR to close? Thanks
Hi @arielshaqed thanks for the review.
- Honestly, I didn't realize that the function names in the controller should match the operationIds, so as I mentioned, I named the functions after the logAction strings. I corrected that now.
- Regarding how this PR helps: my next step is to construct the operationId -> permission-node map, where the resource parts of the nodes are templates. The doc-gen flow can then use these nodes as-is, and the controller flow will fill the templates with the request data.
I intend to replace the code in the operationPermissions functions to use and fill the nodes from the map. I did not include that here because I thought it would make the PR even harder to review and verify. Additionally, if the current operationPermissions functions are correct, it will be easy to verify that the new versions (that use the map) return the same results.
Part of my assumption was that this PR would be hard to review and verify automatically, so I tried to make the changes as straightforward as possible (extracting code from each controller function to a helper function). I am open to suggestions on how to make such PRs more reviewable in the future. I am also curious to see how you plan to automate the verification that the pre-PR and post-PR code do the same thing; I didn't think of a good way to do so myself.
Thanks!
Hi @tkalir , Apologies for another delay - I've been unwell. Better now, so will reply by Sunday.
I will write a script to extract permissions used for each call. I think that may give us confidence we're doing the right thing here.
Here is a more concrete proposal. The chief driving force here is reliability: reduce the risk of human error when extracting the correct permissions by automating it a bit. So I wrote a dumb script that just manages to extract the permissions. Because this is a forklift (we replace one thing with another, but we do not plan on doing the same replacement again), it is allowed for the script to fail and say that it failed in some cases. This is better than making a silent error.
After wasting some time and burning the planet using Gemini Code (pro tip: it doesn't work), I hacked together the attached Python script and ran it on controller.go. It manages to extract permissions for most of the API functions - we can do the remaining 4 manually. Because of reasonably strict error checking in the script, I feel that the automation here is trustworthy.
I think next up we can extract the permissions to a more usable object. For instance, if we had an interface Allower with a method Allowed(params PermissionParams, r *http.Request, w http.ResponseWriter), then we could have a map element "create_presign_multipart_upload" -> AllowByPath(...).
An implementation of AllowByPath would be something like this:
type AllowByPath struct {
    Controller *Controller // needed for AuthorizeCallback
    Action     string
}

func (a *AllowByPath) Allowed(params PermissionParams, r *http.Request, w http.ResponseWriter) bool {
    return a.Controller.AuthorizeCallback(w, r, permissions.Node{
        Permission: permissions.Permission{
            Action:   a.Action,
            Resource: permissions.ObjectArn(params.Repository, params.Path),
        },
    })
}
Once we've done that, we can generate documentation by adding another method to Allower. And we can order the documentation correctly by ordering all the function names output in permissions.out.
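For example, the doc-generation method could be an extra method on the interface (the name Describe and its output format are assumptions, not an agreed API):
// Hypothetical extension of Allower for doc generation; Describe and its output
// format are assumptions, not an agreed-upon API.
type Allower interface {
    Allowed(params PermissionParams, r *http.Request, w http.ResponseWriter) bool
    // Describe returns the required permission in a documentation-friendly form,
    // e.g. "fs:WriteObject on the object ARN of {repository}/{path}".
    Describe() string
}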
WDYT?
Thanks, this looks like a solid idea.
My main concern with this approach is how to verify the code's correctness after the change. In the last PR, I relied on the IDE's refactoring tool to reduce the chance of human error when extracting the nodes' code into functions. With this approach, the IDE won't be as much help. While this approach does result in far fewer new functions, I think the new objects plus the changes in the calling code still result in many changes that would be hard to check manually.
One option for mitigating that could be verifying that Authorize() receives the same input when the same endpoints are called with the same parameters. To try this, I called LinkPhysicalAddress() with a fake Controller object whose Authorize() logs the permission object, and it worked. So what I could do is:
- create a fake controller that logs Authorize()'s input to a file
- call all endpoint handlers
- write the new code
- call the endpoint handlers again
- check that the outputs are the same
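To illustrate the first step, a minimal sketch of the recording fake (the type name and the JSON output are hypothetical):
// Hypothetical recording fake; the type name and JSON output are assumptions,
// not existing code.
type recordingAuthorizer struct {
    enc *json.Encoder // writes each requested permission node to a file
}

// Authorize logs the node that the endpoint asked to check and then allows the
// call, so the handler keeps running.
func (a *recordingAuthorizer) Authorize(w http.ResponseWriter, r *http.Request, node permissions.Node) bool {
    _ = a.enc.Encode(node)
    return true
}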
I suspect the test code would have to change a little for each endpoint, so the test code won't be small, but it is doable. It might be overkill, but since this touches security code across all handlers, it might be justified. What do you think? Is there a simpler way to test this change?
Since we have reproducible automation, I suggest we:
- Review the Python script (I wrote it, so either you review it or I find a third person).
- Sample 10 of its outputs.
- Go on from there.
(Of course change the script if you need to tweak its output).
I do believe the script works as intended, but my concern is also with the boilerplate that the new code will require, which the script won't help verify. But we can postpone this discussion until after I write the code. (Or, if you prefer, I can start by doing only the flows involving AllowByPath to make the review more manageable.)
Hi @arielshaqed ,
Working on the code, I noticed that some endpoints pass a callback to AuthorizeCallback, which will need to be passed to the Allowers. It can certainly work, but it made me wonder if we can simplify the design.
This suggestion assumes that the only real difference between Allowers is the Node object they use; let me know if you think that might change. Otherwise, I suggest storing objects like these in the map:
type PermissionDescriptor interface {
    Permission(params PermissionParams) permissions.Node
    // <doc generation function>
}
type PathPermissionDescriptor struct {
    Action string
}

func (p *PathPermissionDescriptor) Permission(params PermissionParams) permissions.Node {
    return permissions.Node{
        Permission: permissions.Permission{
            Action:   p.Action,
            Resource: permissions.ObjectArn(params.Repository, params.Path),
        },
    }
}
Then, instead of calling authorize() directly, we can wrap it in a function that receives the operationId and all the relevant params, and have it handle the map lookup and call authorize(). This way there's less chance that changes to the flow will ripple to every endpoint.
What do you think?
Hi @tkalir ,
Really glad we're back on this!
Sorry, I am not sure about this part of your suggestion; I am probably missing your point in all of this.
return permissions.Node{
    Permission: permissions.Permission{
        Action:   p.Action,
        Resource: permissions.ObjectArn(params.Repository, params.Path),
    },
}
There are many more types of Nodes in actual use! I mean, many Nodes refer to ARNs other than object ARNs, and we sometimes use "AND" Nodes.
I think you mean more than what I read -- and I would be glad if you could help me understand.
Hi, I'll explain a bit more.
This suggestion is similar to your last one: having all objects in the map share an interface (PermissionDescriptor), with the same number of implementations. Meaning, whenever the permission node has the same ARN structure with a different action, they share an implementation. Cases where AND is used may each require a different implementation.
Compared to the Allowers design, it's just about splitting a bit differently what goes in the interface implementation and what goes in the calling code. Instead of having each implementation call AuthorizeCallback with a different node object, I have each implementation return a different node, which the calling code passes to Authorize().
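In other words, the calling side would look roughly like this (the descriptor map and variable names are placeholders):
// Rough sketch of the calling side; the map and variable names are placeholders.
desc := permissionDescriptors[operationID]
if !c.authorize(w, r, desc.Permission(params)) {
    return // assuming authorize writes the error response itself
}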
Some more examples of descriptors:
type BranchPermissionDescriptor struct {
    Action string
}

func (b *BranchPermissionDescriptor) Permission(params PermissionParams) permissions.Node {
    return permissions.Node{
        Permission: permissions.Permission{
            Action:   b.Action,
            Resource: permissions.BranchArn(params.Repository, params.Branch),
        },
    }
}
An example using AND:
type CreateRepoPermissionDescriptor struct{}

func (c *CreateRepoPermissionDescriptor) Permission(params PermissionParams) permissions.Node {
    return permissions.Node{
        Type: permissions.NodeTypeAnd,
        Nodes: []permissions.Node{
            {
                Permission: permissions.Permission{
                    Action:   permissions.CreateRepositoryAction,
                    Resource: permissions.RepoArn(params.Repository),
                },
            },
            {
                Permission: permissions.Permission{
                    Action:   permissions.AttachStorageNamespaceAction,
                    Resource: permissions.StorageNamespace(params.StorageNamespace),
                },
            },
        },
    }
}
(PermissionDescriptor seems too long; if we use this design we can go with "OpPermission", "PermissionSpec", or something similar.) Does that make sense?
Thanks for the explanation - this really makes a lot of sense! So most of the logic I thought you needed to put in the PermissionDescriptor, you actually put into the Permission factory (constructor) functions, right?
You've obviously put a great deal of understanding into our requirements, even those I did not phrase formally. And I think this proposal offers a great path forward.
I am asking for some clarifications, to be sure that I understand and to try to avoid some blockers. I really think we can manage all of these, either this way or another way.
IIUC this means PermissionParams will have many more fields, for branches, source objects (for copy), repositories, users, groups, tags, and all that. That sounds great; I had not thought of it like that. If I now understand correctly, my only request is that the PermissionParams fields be pointers to strings rather than strings: I would like the code in controller.go to be very clear about the difference between "I don't have a parameter value to give here" (nil) and "I have a parameter value to give here, and it is the empty string" (a pointer to an empty string). I realize it's probably OK because empty strings probably don't make sense for any of the parameters. But... maybe they do? Maybe an empty path component could make sense in the future? If we do it with pointers we will never have to worry about it.
Creating a pointer to a string can be unpleasant in Go, particularly when you have a literal or a computed string. We use swag.String(...) which does that for us; you could use that in your upcoming code too.
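For example (the field names below are placeholders):
// Sketch only: PermissionParams fields as *string, so nil ("no value supplied")
// is distinct from a pointer to "" (an explicit empty value). Field names are placeholders.
type PermissionParams struct {
    Repository *string
    Branch     *string
    Path       *string
}

// Building the params with swag.String, which returns a *string for a literal
// or computed string.
func pathParams(repository, path string) PermissionParams {
    return PermissionParams{
        Repository: swag.String(repository),
        Path:       swag.String(path),
    }
}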
Hi, I was sick for a few days so it got delayed. Thanks for the feedback!
Regarding where the logic is located, to make sure we are on the same page, my suggestion is to replace Authorize() with a function like this:
func (c *Controller) authorizeReq(w http.ResponseWriter, r *http.Request, params PermissionParams, operationId string, opts *AuthorizeOpts) bool {
    desc := GetPermissionDescriptor(operationId)
    callback := writeError
    if opts != nil && opts.callback != nil {
        callback = opts.callback
    }
    return c.authorizeCallback(w, r, desc.Permission(params), callback)
}

// currently a custom callback is needed only in ~2 endpoints, but having opts reduces
// the chance of needing to change the function's signature in the future
GetPermissionDescriptor just looks up the PermissionDescriptor object in the map, using the operationId as the key.
The map is populated with (for the examples from my last message):
"CreatePresignMultipartUpload" -> &BranchPermissionDescriptor{Action: permissions.WriteObjectAction}
"CreateRepo" -> &CreateRepoPermissionDescriptor{}
So not a lot of logic overall.
Some additional points:
- As you said, PermissionParams will use multiple fields for tags, branches, etc. I think that currently there won't be a use for having nil values there, but future-proofing won't harm us, so I'll do as you suggested.
- Which keys to use in the map: I suggest using the operationId names, like createPresignMultipartUpload. When generating the docs we will need to fetch info from the swagger (like the endpoint URL for the operation), and the swagger uses operationIds. There are a few possibilities: (a) using hard-coded strings as keys in the map; (b) generating operationId constants from the swagger and using them as keys; (c) generating Operation objects from the swagger, which will include the operationIds and endpoint URLs (so parsing the swagger in the doc-gen part won't be necessary). If we want to create an operation -> logAction mapping, the logAction names can be added as custom fields in the swagger, for the cases where getting the logAction from the operationId is not trivial. In this option we use <operation>.id as the key.
A or B will be easier, as this issue doesn't necessitate dealing with the logAction names, but tell me what you think.
Glad to hear you're better!
I'm happy with any of the options. Personally I do not see much benefit in consts / string enums / int enums compared to just copying the operation ID. I mean, it basically means writing stuff like
func GetObject(...params) {
    something.Permissions(GetObjectActionName, ...)
    // ...
}
rather than
func GetObject(...params) {
    something.Permissions("get-object", ...)
    // ...
}
The minor advantage to the first is that it catches some errors at compile-time rather than at run-time. But:
- If you write "geet-object" with an error, then any test of the code will fail. And I really hope we test our changes by running them at least once 🙀
- A much more common error for me is to copy-paste and end up with "put-object" instead of "get-object". And then I'd be as likely to write PutObjectActionName.
So no strong feelings here.
Thanks!
I think another benefit of having constants is that it makes it easier to find where in the code the string is used (for this flow's purpose). I mean, the operationId might appear in multiple places in the codebase, but the constant appears only in this flow. However, since you don't have a preference and this task is quite big as it is, I think I'll go with the simpler solution.
I think my next step will be doing a PR for migrating the Path-using endpoints to this method, and then we can talk testing with concrete examples.