
dvc auth check

Open dberenbaum opened this issue 3 years ago • 8 comments

@dberenbaum why don't we introduce a command in dvc to check the availability of buckets/directories? Like dvc auth check s3://mybucket / dvc credentials azure://mycontainer

From @dmpetrov

dberenbaum avatar Sep 02 '22 14:09 dberenbaum

Related: #6927. Maybe it could be an option to pass arbitrary information to dvc remote test to also test arbitrary credentials for things like get-url/import-url.

dberenbaum avatar Sep 02 '22 14:09 dberenbaum

I did a quick look into it:

  • AWS: https://stackoverflow.com/questions/31836816/how-to-test-credentials-for-aws-command-line-tools (seems doable)
  • Azure: https://stackoverflow.com/questions/53258301/how-to-verify-authentication-of-microsoft-azure-storage-accounts-when-called-wit (doesn't seem straightforward)
  • GCloud: https://cloud.google.com/sdk/gcloud/reference/auth/list (haven't tested this but looks promising)

Even if we have a credentials check, the operations may fail due to permissions to list/read/write. So is it better to have a credentials check or a --dry-run type option? Would it have to start the operation and then interrupt?

dberenbaum avatar Sep 02 '22 14:09 dberenbaum

My understanding here is that fsspec does not have a good way to check for authentication issues. It may raise OSError on auth issues, but lots of APIs assume that a raised OSError/IOError means FileNotFoundError (e.g. isdir, info, walk, etc.).
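A minimal sketch of the masking problem described here, using only the standard library. `FakeRemoteFS`, `naive_isdir`, and `strict_isdir` are hypothetical names for illustration, not fsspec or DVC APIs. The only fact relied on is that `PermissionError` and `FileNotFoundError` both subclass `OSError`, so a broad `except OSError` cannot tell them apart.

```python
class FakeRemoteFS:
    """Hypothetical stand-in for a remote filesystem (not a real fsspec class)."""

    def __init__(self, authenticated):
        self.authenticated = authenticated
        self.dirs = {"bucket/data"}

    def info(self, path):
        if not self.authenticated:
            # Assumption: the backend surfaces auth failures as an OSError subclass.
            raise PermissionError(f"access denied: {path}")
        if path not in self.dirs:
            raise FileNotFoundError(path)
        return {"name": path, "type": "directory"}


def naive_isdir(fs, path):
    # Broad handler: an auth failure is indistinguishable from "not found".
    try:
        return fs.info(path)["type"] == "directory"
    except OSError:
        return False


def strict_isdir(fs, path):
    # Only a genuinely missing path maps to False; other errors propagate.
    try:
        return fs.info(path)["type"] == "directory"
    except FileNotFoundError:
        return False
```

With an unauthenticated `FakeRemoteFS`, `naive_isdir` reports that the directory does not exist, while `strict_isdir` lets the `PermissionError` reach the caller.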

skshetry avatar Sep 02 '22 14:09 skshetry

If the check is cheap to perform, I think it would also be nice to include it as part of dvc doctor.

Similar to how the Caches section displays a link to the Troubleshooting page, we could do the same for the Remotes section.

daavoo avatar Sep 16 '22 11:09 daavoo

@skshetry We are not limited by fsspec there; we can add any auth checks that we please on top and then consider propagating them upstream. But overall, auth errors are handled differently depending on the filesystem, in some better than in others. The OSError you've mentioned is just a bug, we've discussed this before.

What's important to remember here is that an auth check is not going to be a silver bullet, as with many operations we won't know if we have access until we try to do them. For example, s3 ls might work with some prefixes but not with others and so on; there are a lot of variables there. The best we can reasonably do is probably try listing a specified path, but even that is pretty limited.
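To illustrate why listing one path says little about another, here is a small standard-library sketch. `PrefixScopedFS` and `can_list` are hypothetical names, and the prefix-based permission model is an assumption made for the example, not how any particular cloud provider evaluates access.

```python
class PrefixScopedFS:
    """Hypothetical filesystem whose credentials only cover some prefixes."""

    def __init__(self, readable_prefixes):
        self.readable_prefixes = tuple(readable_prefixes)

    def ls(self, path):
        # str.startswith accepts a tuple and matches any of its entries.
        if not path.startswith(self.readable_prefixes):
            raise PermissionError(f"access denied: {path}")
        return [f"{path.rstrip('/')}/example.txt"]


def can_list(fs, path):
    """Return True if listing `path` succeeds, False on a permission error."""
    try:
        fs.ls(path)
    except PermissionError:
        return False
    return True
```

A check that passes for `bucket/public` would still say nothing about `bucket/private` on the same remote, which is the limitation described above.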

efiop avatar Sep 16 '22 11:09 efiop

The OSError you've mentioned is just a bug, we've discussed this before.

I'll consider this as a https://github.com/iterative/dvc/labels/feature%20request rather than a https://github.com/iterative/dvc/labels/bug.

What's important to remember here is that auth check is not going to be a silver bullet

I agree, I don't think the command helps, we should try to improve on error messages and exceptions. I'll suggest closing this in favour of #6353.

skshetry avatar Sep 19 '22 09:09 skshetry

I agree, I don't think the command helps, we should try to improve on error messages and exceptions. I'll suggest closing this in favour of https://github.com/iterative/dvc/issues/6353.

The way I interpreted the request is that the command is not intended to be a silver bullet but rather a quick way of detecting errors regarding credentials and/or remote setup. It would be nice to clarify the use case for the command.

daavoo avatar Sep 19 '22 09:09 daavoo

See https://github.com/iterative/studio/issues/4105#issuecomment-1234993440

dberenbaum avatar Sep 19 '22 14:09 dberenbaum

The way I interpreted the request is that the command is not intended to be a silver bullet

An error could happen due to a lot of things. Ideally, dvc should be able to inform users that auth failed in the commands where it is relevant; there should not be a need for a separate command.

skshetry avatar Nov 07 '22 14:11 skshetry

Something like dvc remote check would be a very handy thing to have (e.g. on studio side as well). Something like

# dvc remote check
mys3remote: Unable to locate credentials
myazureremote: Auth failed
mylocalremote: Does not exist

(just off the top of my head, it should be something simple and informative). It doesn't have to be perfect; even calling fs.exists on the remote path should be super informative. This should be a slim, simple and straightforward tool. Maybe we could also add some useful hints there, but that might risk bloating it; we should not try to create a spaceship out of it (maybe some opt-in flags to the rescue, dunno).
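A rough sketch of what such a check could look like, assuming each remote exposes an fsspec-style `exists(path)` method. `StubFS`, `check_remote`, and `check_remotes` are hypothetical helpers written for this illustration; they are not part of DVC, and the status strings simply echo the mock output above.

```python
class StubFS:
    """Hypothetical filesystem stub: returns or raises a preset result."""

    def __init__(self, result):
        self.result = result

    def exists(self, path):
        if isinstance(self.result, Exception):
            raise self.result
        return self.result


def check_remote(fs, path):
    """Return a short status string for one remote."""
    try:
        found = fs.exists(path)
    except PermissionError:
        return "Auth failed"
    except OSError as exc:
        # Any other I/O failure is reported verbatim rather than guessed at.
        return f"Error: {exc}"
    return "OK" if found else "Does not exist"


def check_remotes(remotes):
    """Map remote name -> status; `remotes` is {name: (fs, path)}."""
    return {name: check_remote(fs, path) for name, (fs, path) in remotes.items()}
```

Catching `PermissionError` before the generic `OSError` keeps auth failures from being reported as a missing path, in line with the earlier discussion of error handling.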

Might get to this in the near future for studio.

efiop avatar Aug 16 '23 22:08 efiop