remote-apis How does GetCapabilities() interact with authorization & unknown instance names?

Assume I have a build cluster that supports writes to the AC, but only to certain users/clients.
- Is it permitted to return ActionCacheUpdateCapabilities that has update_enabled set to true for some of them, while returning false to ones that are unauthorized? Or should access controls have no influence in this matter?
- Nit: What is the default value of update_enabled, in case action_cache_update_capabilities in ActionCacheUpdateCapabilities is not set? I assume false? Why isn't this boolean embedded into the parent directly?
Does the same hold for ExecutionCapabilities's exec_enabled field?
- What does it mean if this field is not set? That it's unknown whether execution is supported?
Assume GetCapabilities() is called against an instance name that does not exist. Should an error be returned? If so, which one? NOT_FOUND or INVALID_ARGUMENT? Or is it allowed to succeed, returning a ServerCapabilities that has both execution_capabilities and cache_capabilities left unset?
What if the user is not authorized to interact with the instance name? Should we return PERMISSION_DENIED errors? This is in my opinion somewhat inconsistent with exec_enabled and update_enabled, as based on the previous questions those may also be used to convey authorization related decisions.

Long story short, I have the feeling that the semantics of the GetCapabilities() call are underspecified.

Feb 24 '22 13:02 EdSchouten

Definitely agreed that the Capabilities API is underspecified, and has never really fulfilled its purpose. My understanding of your questions, based on both historical context and (where applicable) RBE's implementation:

On Thu, Feb 24, 2022 at 8:55 AM Ed Schouten @.***> wrote:

Assume I have a build cluster that supports writes to the AC, but only to certain users/clients.

Is it permitted to return ActionCacheUpdateCapabilities to one user that has update_enabled set to true for some of them, while returning false to ones that are unauthorized? Or should access controls have no influence?

Yes, for both this and other fields, it is possible to return different values per-user. The Capabilities capture what a particular user can do on a particular instance on a particular server.

Nit: What is the default value of update_enabled, in case action_cache_update_capabilities in ActionCacheUpdateCapabilities is not set? I assume false? Why isn't this boolean embedded into the parent directly?

Default value should be false. Presumably we put the boolean in a separate message to allow for future expansion. Looking at it today, that was a mistake--the parent message could have accommodated the expansion directly and we would have had a simpler API.

Does the same hold for ExecutionCapabilities's exec_enabled field?

What does it mean if this field is not set? That it's unknown whether execution is supported?

Default value of exec_enabled should be false. That could indicate that execution is not supported by the server (i.e., it only supports local execution + remote cache), or that the calling user does not have permissions to execute actions on the server.

Assume GetCapabilities() is called against an instance name that does not exist. Should an error be returned? If so, which one? NOT_FOUND or INVALID_ARGUMENT? Or is it allowed to succeed, returning a ServerCapabilities that has both execution_capabilities and cache_capabilities left unset?

The semantics here are surprisingly complicated. Depending on the situation, the error should be one of PERMISSION_DENIED, NOT_FOUND, or INVALID_ARGUMENT, being careful not to leak the existence of an instance that the user otherwise would not know about. Thinking through this quickly:

If the user specifies an instance, and the server doesn't support the concept of instances at all, return INVALID_ARGUMENT.
If the user does not specify an instance and the server requires one, return INVALID_ARGUMENT.
If the user specifies an instance, but they do not have permission to access it (see below, we reuse the blob read permission), then whether the server should return PERMISSION_DENIED or NOT_FOUND depends on whether the user has permission to see the instance (even if they don't have permission to access it). I think this boils down to whether the concept of hierarchy is supported in instance names (e.g., a conceptual "folder" that can contain multiple instances). If so, and the user has permission to see the instance, then return PERMISSION_DENIED. If the user doesn't have permission to see the instance, or hierarchy is not supported, then return NOT_FOUND but with a message like "Instance foo does not exist, or you don't have permission to access it.

What if the user is not authorized to interact with the instance name? Should we return PERMISSION_DENIED errors? This is in my opinion somewhat inconsistent with exec_enabled and update_enabled, as based on the previous questions those may also be used to convey authorization related decisions.

This is truly awkward, probably because we tried to cram permissions for three services into one Capabilities API. We reuse the permission to read blobs as the permission to call GetCapabilities, on the theory that Remote Execution isn't very useful if you can't read blobs (and that many other permissions would effectively leak blobs if you didn't have that permission). Definitely an imperfect solution.

Long story short, I have the feeling that the semantics of the GetCapabilities() call are underspecified.

— Reply to this email directly, view it on GitHub https://github.com/bazelbuild/remote-apis/issues/215, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADMU237XRGJU3GYWOZNL6ULU4Y2GBANCNFSM5PHOMT5Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

Feb 24 '22 14:02 bergsieker

I suspect that we may be able to make things clearer by embedding google.rpc.Status messages in there. For example, instead of having an update_enabled field that is a boolean, we could return a google.rpc.Status that indicates why a client won't be able to make AC updates. The same with execution: execution_capabilities could be placed in a oneof that gives a reason why remote execution can't be used.

Another thing I've observed to be impractical is that symlink_absolute_path_strategy is part of CacheCapabilities. You could argue that for a pure data store, it's irrelevant whether symlinks can contain absolute paths. This is more a limitation of remote execution, thereby making ExecutionCapabilities a better fit.

What are your thoughts on leaving this issue open, but rebranding it to be a "V3 idea" ticket?

Feb 28 '22 11:02 EdSchouten

I'm all for leaving this open and fixing in v3. I suspect we'll either eliminate the Capabilities API entirely in v3, or significantly overhaul it and possibly split it by associated service--the API as-written has never really lived up to its promise.

In terms of specifics (for future reference), I'm not wild about reusing google.rpc.Status within the response. That seems like we'd be reusing Status because it's convenient, not because it's a great fit. If we want more granular errors, I think we should define them ourselves.

I haven't given much thought to symlink_absolute_path_strategy. I think there's an inherent linkage between CacheCapabilities and ExecutionCapabilities. Possibly the right position, API-wise, is to include it in both. But let's deal with that in v3.

Feb 28 '22 15:02 bergsieker

remote-apis remote-apis copied to clipboard

How does GetCapabilities() interact with authorization & unknown instance names?

remote-apis
remote-apis copied to clipboard