
init: handle auth errors (remote locations)

Open jorgeorpinel opened this issue 2 years ago • 8 comments

If you try this (from https://mlem.ai/doc/user-guide/remote-objects#cloud-remotes)

$ mlem init s3://example-mlem-get-started
❌ Unexpected error: Connect timeout on endpoint URL: "http://169.254.169.254/latest/api/token"

The error message isn't informative. I assume it's an auth issue since I don't have access to bucket example-mlem-get-started. The error handling could be better here, I think.

jorgeorpinel avatar Jul 14 '22 16:07 jorgeorpinel

p.s. there's no verbose output option right? (like dvc ... -v)

jorgeorpinel avatar Jul 14 '22 16:07 jorgeorpinel

There is a mlem --tb ... option. As for the error itself, we use fsspec, and it seems it does not wrap this exception into something nice. It would be very hard to wrap each call with every possible error (it's probably a connection error for s3, but it might be something else for azure/gcp/etc.). Btw, I think I usually get Access Denied or something like it from s3, so please go ahead and post the full traceback here.

mike0sv avatar Jul 14 '22 18:07 mike0sv

There is mlem --tb ... option

Should we consider changing that to --verbose (-v) for consistency with dvc? I think it's pretty standard, too.

very hard to wrap each call with every possible error

Even a general error message that's friendlier and more informative would be better. Ultimately it's a UX/quality question.

please go ahead and post full traceback here

mlem --tb init s3://example-mlem-get-started
Traceback (most recent call last):
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 986, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
  File "/opt/homebrew/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 1050, in create_connection
    sock = await self._connect_sock(
  File "/opt/homebrew/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 961, in _connect_sock
    await self.sock_connect(sock, address)
  File "/opt/homebrew/Cellar/python@3.9/3.9.13_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/selector_events.py", line 500, in sock_connect
    return await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiohttp/client.py", line 535, in _request
    conn = await self._connector.connect(
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 542, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 907, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 1175, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 986, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/async_timeout/__init__.py", line 129, in __aexit__
    self._do_exit(exc_type)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/async_timeout/__init__.py", line 212, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/httpsession.py", line 178, in send
    response = await self._session.request(
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiohttp/client.py", line 539, in _request
    raise ServerTimeoutError(
aiohttp.client_exceptions.ServerTimeoutError: Connection timeout to host http://169.254.169.254/latest/api/token

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jop/MLEM-repos/test/.venv/bin/mlem", line 8, in <module>
    sys.exit(app())
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/mlem/cli/main.py", line 278, in inner
    res = f(*iargs, **ikwargs) or {}
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/mlem/cli/init.py", line 19, in init
    init(path)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/mlem/api/commands.py", line 214, in init
    if fs.exists(path):
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 86, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 66, in sync
    raise return_result
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 26, in _runner
    result[0] = await coro
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/s3fs/core.py", line 888, in _exists
    await self._info(path, bucket, key, version_id=version_id)
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/s3fs/core.py", line 1140, in _info
    out = await self._call_s3(
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/s3fs/core.py", line 325, in _call_s3
    await self.set_session()
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/s3fs/core.py", line 473, in set_session
    self._s3 = await s3creator.__aenter__()
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/session.py", line 22, in __aenter__
    self._client = await self._coro
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/session.py", line 102, in _create_client
    credentials = await self.get_credentials()
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/session.py", line 133, in get_credentials
    self._credentials = await (self._components.get_component(
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/credentials.py", line 814, in load_credentials
    creds = await provider.load()
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/credentials.py", line 486, in load
    metadata = await fetcher.retrieve_iam_role_credentials()
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/utils.py", line 175, in retrieve_iam_role_credentials
    token = await self._fetch_metadata_token()
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/utils.py", line 88, in _fetch_metadata_token
    response = await session.send(request.prepare())
  File "/Users/jop/MLEM-repos/test/.venv/lib/python3.9/site-packages/aiobotocore/httpsession.py", line 210, in send
    raise ConnectTimeoutError(endpoint_url=request.url, error=e)
botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "http://169.254.169.254/latest/api/token"

jorgeorpinel avatar Jul 15 '22 01:07 jorgeorpinel
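
A note on the traceback above: 169.254.169.254 is the EC2 instance metadata endpoint, and the last frames show botocore's credential loader trying to fetch an IAM role token from it. That suggests the timeout means "no AWS credentials were found on this machine" rather than "access to the bucket was denied". Below is a minimal sketch of how a friendlier hint could be derived from the exception chain. The names `iter_chain` and `friendly_hint` and the hint wording are hypothetical and are not part of MLEM.

```python
from typing import Iterator, Optional

# Link-local address of the EC2 instance metadata service.
IMDS_HOST = "169.254.169.254"


def iter_chain(exc: BaseException) -> Iterator[BaseException]:
    """Yield exc, then each exception it was raised from or during."""
    seen = set()
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        yield current
        current = current.__cause__ or current.__context__


def friendly_hint(exc: BaseException) -> Optional[str]:
    """Return a hypothetical user-facing hint, or None if nothing matches."""
    for err in iter_chain(exc):
        if IMDS_HOST in str(err):
            return (
                "Could not find AWS credentials: the lookup fell back to the "
                "EC2 metadata endpoint and timed out. Check your AWS configuration."
            )
    return None
```

A CLI error handler could call `friendly_hint(exc)` first and fall back to the generic message when it returns None.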

  1. For an unknown bucket I get:
$ mlem init s3://example-mlem-get-starteddsda
███╗   ███╗██╗     ███████╗███╗   ███╗
████╗ ████║██║     ██╔════╝████╗ ████║
██╔████╔██║██║     █████╗  ██╔████╔██║
██║╚██╔╝██║██║     ██╔══╝  ██║╚██╔╝██║
██║ ╚═╝ ██║███████╗███████╗██║ ╚═╝ ██║
╚═╝     ╚═╝╚══════╝╚══════╝╚═╝     ╚═╝

┌───────────────────────────────────────────────────────────────────┐
│                                                                   │
│       MLEM has enabled anonymous aggregate usage analytics.       │
│    Read the analytics documentation (and how to opt-out) here:    │
│            <https://mlem.ai/docs/user-guide/analytics>            │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
❌ Unexpected error: The specified bucket does not exist
Please report it here: <https://github.com/iterative/mlem/issues>

So there are some meaningful errors. Would be nice to wrap them in a MLEM Exception.

  2. For making -v behave like --tb, we have this issue: https://github.com/iterative/mlem/issues/409. We even had a PR there, but it was abandoned.

aguschin avatar Nov 07 '22 07:11 aguschin

We can wrap all the Location methods in try/except. But that's not enough, since we use fs methods internally all the time, and any one of those calls can fail with any error (in this case it was botocore.exceptions.ConnectTimeoutError, for example). We need to think of some elegant solution; otherwise we will end up adding a couple dozen ugly snippets like

try:
    fs.something()
except Exception as e:
    raise MlemError(str(e)) from e

mike0sv avatar Nov 07 '22 14:11 mike0sv
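
One way to keep such snippets short is a context manager that does the conversion in one place. This is only a sketch: `MlemError` here is a local stand-in class so the example is self-contained, and `wrap_fs_errors` is a hypothetical helper, not something that exists in MLEM.

```python
from contextlib import contextmanager
from typing import Iterator


class MlemError(Exception):
    """Stand-in for MLEM's base error so the sketch is self-contained."""


@contextmanager
def wrap_fs_errors(action: str) -> Iterator[None]:
    """Re-raise anything from the wrapped block as MlemError."""
    try:
        yield
    except MlemError:
        raise  # already a friendly error, pass it through unchanged
    except Exception as e:
        # Keep the original exception as __cause__ for --tb output.
        raise MlemError(f"Failed to {action}: {e}") from e
```

A call site would then read `with wrap_fs_errors("check the path"): fs.exists(path)`. It still has to be written at every call site, so it reduces the noise without removing the repetition.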

Good point. As an alternative, we may just stop printing

❌ Unexpected error: The specified bucket does not exist
Please report it here: <https://github.com/iterative/mlem/issues>

and instead print something like

❌ MlemFsspecError: The specified bucket does not exist

I assume this could be fixed in a single place - in https://github.com/iterative/mlem/blob/main/mlem/cli/main.py#L426

This won't change the API, though. But maybe returning specific fsspec exceptions is OK, or even better than returning some general MLEM exception.

aguschin avatar Nov 07 '22 15:11 aguschin
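
A rough sketch of what such a single-place formatter might look like. Everything here is an assumption made for illustration: the function names, the list of module prefixes, and the idea of treating any OSError as a filesystem error. It is a heuristic, so it can both miss filesystem errors and mislabel unrelated ones.

```python
# Hypothetical list of top-level packages whose exceptions are treated
# as "filesystem" errors; the real set would need to be curated.
FS_ERROR_MODULES = ("botocore", "aiobotocore", "s3fs", "fsspec", "gcsfs", "adlfs")

ISSUES_URL = "https://github.com/iterative/mlem/issues"


def looks_like_fs_error(exc: BaseException) -> bool:
    """Guess whether exc came from a filesystem backend."""
    root = (type(exc).__module__ or "").split(".")[0]
    return root in FS_ERROR_MODULES or isinstance(exc, OSError)


def format_cli_error(exc: BaseException) -> str:
    """Build the message the CLI would print for an uncaught exception."""
    if looks_like_fs_error(exc):
        return f"❌ MlemFsspecError: {exc}"
    return f"❌ Unexpected error: {exc}\nPlease report it here: <{ISSUES_URL}>"
```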

Raising some specific exception is not the problem. The problem is catching errors from fsspec, because there is no specific exception that we can catch. Generally, fs implementations try to use the generic FileNotFoundError and similar, but they can also throw anything else. E.g. the botocore exception I mentioned earlier is subclassed from IOError.

mike0sv avatar Nov 08 '22 10:11 mike0sv

So the only way is to put a narrow try/except block around every fsspec call, so we can be sure that it's the fsspec call that broke. And that brings us back to my previous comment.

mike0sv avatar Nov 08 '22 10:11 mike0sv
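
One possible way to narrow the try/except to every fs call without writing it at every call site is a proxy object that wraps the filesystem. This is a sketch under assumptions: `ErrorWrappingFS` is a hypothetical class, `MlemError` is a local stand-in, and the wrapped object is treated as an opaque object with methods rather than as a real fsspec filesystem.

```python
from typing import Any


class MlemError(Exception):
    """Stand-in for MLEM's base error so the sketch is self-contained."""


class ErrorWrappingFS:
    """Proxy that forwards to `fs` and converts failures to MlemError."""

    def __init__(self, fs: Any) -> None:
        self._fs = fs

    def __getattr__(self, name: str) -> Any:
        # Only called for names not found on the proxy itself.
        attr = getattr(self._fs, name)
        if not callable(attr):
            return attr

        def call(*args: Any, **kwargs: Any) -> Any:
            try:
                return attr(*args, **kwargs)
            except Exception as e:
                # The try block contains only the fs call, so the
                # failure is known to come from the filesystem layer.
                raise MlemError(f"{name} failed: {e}") from e

        return call
```

This only covers direct method calls on the proxy. Errors raised later, for example while reading from a file object returned by `open`, would still escape unwrapped.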