push: Infinite loop in credential configuration detected

Open nisace opened this issue 1 year ago • 13 comments

Bug Report

Description

I am using dvc with an AWS S3 remote. The credentials to access S3 are handled by aws-vault.

When using dvc 2.5.4, everything works fine. However, with 2.6.x or newer, I get the following error when I try to dvc push:

ERROR: unexpected error - Infinite loop in credential configuration detected. Attempting to load from profile my-profile-a which has already been visited. Visited profiles: ['my-profile-b', 'my-profile-a']

The error appears if:

  • I try to dvc push after having updated a file inside a dvc-tracked folder.

However, I can dvc push normally if:

  • I add a new file to a dvc-tracked folder.
  • I add a new dvc-tracked file.
  • I update a dvc-tracked file.

My aws-vault credentials configuration is as follows:

[profile my-profile-a]
region=eu-west-3
s3.multipart_threshold = 5GB
credential_process=aws-vault exec --no-session --json my-profile-a
s3=
  multipart_threshold = 5GB

[profile my-profile-b]
region=us-west-1
source_profile = my-profile-a
role_arn = arn:aws:iam::123456789012:role/MyRole
role_session_name = my.name
s3=
  multipart_threshold = 5GB

The dvc config is as follows:

[core]
    remote = s3remote
['remote "s3remote"']
    url = s3://my-bucket
    profile = my-profile-b

A possible explanation is that dvc calls the botocore function that checks the credentials more than once. At first I thought it was related to this botocore issue, but since the error appears only when trying to dvc push a folder after updating one of its files, the issue seems to be related to dvc.

Reproduce

  • Set aws-vault with a similar configuration
  • Set dvc with a similar configuration
  • Add and push a folder to dvc
  • Update an existing file inside this folder
  • Try to dvc push the folder

Expected

dvc push should be possible.

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 3.30.1 (pip)
-------------------------
Platform: Python 3.10.10 on Linux-5.15.0-89-generic-x86_64-with-glibc2.31
Subprojects:
	dvc_data = 2.22.0
	dvc_objects = 1.2.0
	dvc_render = 0.6.0
	dvc_task = 0.3.0
	scmrepo = 1.5.0
Supports:
	http (aiohttp = 3.9.0, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.9.0, aiohttp-retry = 2.8.3),
	s3 (s3fs = 2023.10.0, boto3 = 1.28.64)
Config:
    Global: /home/myname/.config/dvc
    System: /etc/xdg/xdg-ubuntu/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/vgubuntu-root
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/mapper/vgubuntu-root
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/8767037a1b741787f352cb10a11ee443

Additional Information (if any):

$ dvc push -v
2023-12-07 16:52:31,294 DEBUG: v3.30.1 (pip), CPython 3.10.10 on Linux-5.15.0-89-generic-x86_64-with-glibc2.31
2023-12-07 16:52:31,294 DEBUG: command: /home/myname/work/tmp/test-dvc-versions/.venv/bin/dvc push -v
2023-12-07 16:52:31,618 DEBUG: Preparing to transfer data from '/home/myname/work/tmp/test-dvc-versions/.dvc/cache/files/md5' to 's3://my-bucket/myname/test-dvc-versions/files/md5'
2023-12-07 16:52:31,619 DEBUG: Preparing to collect status from 'my-bucket/myname/test-dvc-versions/files/md5'           
2023-12-07 16:52:31,619 DEBUG: Collecting status from 'my-bucket/myname/test-dvc-versions/files/md5'                     
2023-12-07 16:52:31,622 DEBUG: Querying 2 oids via object_exists                                                                         
2023-12-07 16:52:33,528 ERROR: unexpected error - Infinite loop in credential configuration detected. Attempting to load from profile my-profile-a which has already been visited. Visited profiles: ['my-profile-b', 'my-profile-a']                                              
Traceback (most recent call last):
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc/cli/__init__.py", line 211, in main
    ret = cmd.do_run()
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc/cli/command.py", line 27, in do_run
    return self.run()
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc/commands/data_sync.py", line 64, in run
    processed_files_count = self.repo.push(
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc/repo/__init__.py", line 60, in wrapper
    return f(repo, *args, **kwargs)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc/repo/push.py", line 117, in push
    push_transferred, push_failed = ipush(
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_data/index/push.py", line 68, in push
    result = transfer(
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_data/hashfile/transfer.py", line 204, in transfer
    status = compare_status(
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_data/hashfile/status.py", line 176, in compare_status
    dest_exists, dest_missing = status(
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_data/hashfile/status.py", line 136, in status
    exists = hashes.intersection(
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_data/hashfile/status.py", line 45, in _indexed_dir_hashes
    indexed_dir_exists.update(hashes)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_objects/db.py", line 359, in list_oids_exists
    in_remote = self.fs.exists(paths, batch_size=jobs)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_objects/fs/base.py", line 371, in exists
    return fut.result()
  File "/home/myname/.pyenv/versions/3.10.10/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/home/myname/.pyenv/versions/3.10.10/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/dvc_objects/executors.py", line 134, in batch_coros
    result = fut.result()
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/s3fs/core.py", line 1035, in _exists
    await self._info(path, bucket, key, version_id=version_id)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/s3fs/core.py", line 1302, in _info
    out = await self._call_s3(
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/s3fs/core.py", line 341, in _call_s3
    await self.set_session()
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/s3fs/core.py", line 522, in set_session
    self._s3 = await s3creator.get_client()
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/s3fs/utils.py", line 72, in get_client
    self._client = await self._stack.enter_async_context(
  File "/home/myname/.pyenv/versions/3.10.10/lib/python3.10/contextlib.py", line 619, in enter_async_context
    result = await _cm_type.__aenter__(cm)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/aiobotocore/session.py", line 27, in __aenter__
    self._client = await self._coro
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/aiobotocore/session.py", line 173, in _create_client
    credentials = await self.get_credentials()
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/aiobotocore/session.py", line 83, in get_credentials
    self._credentials = await (
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/aiobotocore/credentials.py", line 960, in load_credentials
    creds = await provider.load()
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/aiobotocore/credentials.py", line 689, in load
    return await self._load_creds_via_assume_role(self._profile_name)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/aiobotocore/credentials.py", line 692, in _load_creds_via_assume_role
    role_config = self._get_role_config(profile_name)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/botocore/credentials.py", line 1562, in _get_role_config
    self._validate_source_profile(profile_name, source_profile)
  File "/home/myname/work/tmp/test-dvc-versions/.venv/lib/python3.10/site-packages/botocore/credentials.py", line 1613, in _validate_source_profile
    raise InfiniteLoopConfigError(
botocore.exceptions.InfiniteLoopConfigError: Infinite loop in credential configuration detected. Attempting to load from profile my-profile-a which has already been visited. Visited profiles: ['my-profile-b', 'my-profile-a']

2023-12-07 16:52:33,621 DEBUG: link type reflink is not available ([Errno 95] no more link types left to try out)
2023-12-07 16:52:33,622 DEBUG: Removing '/home/myname/work/tmp/.iKooz7Cn9gqk4S6XK6HrH2.tmp'
2023-12-07 16:52:33,622 DEBUG: Removing '/home/myname/work/tmp/.iKooz7Cn9gqk4S6XK6HrH2.tmp'
2023-12-07 16:52:33,623 DEBUG: Removing '/home/myname/work/tmp/.iKooz7Cn9gqk4S6XK6HrH2.tmp'
2023-12-07 16:52:33,624 DEBUG: Removing '/home/myname/work/tmp/test-dvc-versions/.dvc/cache/files/md5/.R58Taoujw2rofT9vbPZxxm.tmp'
2023-12-07 16:52:33,646 DEBUG: Version info for developers:
DVC version: 3.30.1 (pip)
-------------------------
Platform: Python 3.10.10 on Linux-5.15.0-89-generic-x86_64-with-glibc2.31
Subprojects:
	dvc_data = 2.22.0
	dvc_objects = 1.2.0
	dvc_render = 0.6.0
	dvc_task = 0.3.0
	scmrepo = 1.5.0
Supports:
	http (aiohttp = 3.9.0, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.9.0, aiohttp-retry = 2.8.3),
	s3 (s3fs = 2023.10.0, boto3 = 1.28.64)
Config:
	Global: /home/myname/.config/dvc
	System: /etc/xdg/xdg-ubuntu/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/vgubuntu-root
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/mapper/vgubuntu-root
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/8767037a1b741787f352cb10a11ee443

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2023-12-07 16:52:33,649 DEBUG: Analytics is enabled.
2023-12-07 16:52:33,716 DEBUG: Trying to spawn ['daemon', 'analytics', '/tmp/tmpfw6iwkr3', '-v']
2023-12-07 16:52:33,722 DEBUG: Spawned ['daemon', 'analytics', '/tmp/tmpfw6iwkr3', '-v'] with pid 29059

nisace avatar Dec 07 '23 18:12 nisace

@nisace Thanks for the investigation! From a quick look it indeed looks like https://github.com/boto/botocore/issues/1780 and I don't really see dvc itself doing anything funky here.

but as the error appears only when trying to dvc push a folder after an update to one of its files, the issue seems to be related to dvc.

So a fresh push to a new remote doesn't produce this error? Note that dvc only tries to upload missing files, so a push after an update would trigger it, whereas a repeated push (not a fresh one) would be a no-op and would not. Just in case there is some confusion around it.

efiop avatar Dec 11 '23 13:12 efiop

@efiop thanks for your response! Yes, a fresh push does not produce the error. I've tried again and here is the behavior:

  • dvc add a new folder: ok
  • dvc push the folder: ok
  • add a new file to the folder and dvc add: ok
  • dvc status: ok (Data and pipelines are up to date)
  • dvc status -c: ok, produces the following output
        new:                data/folder/file-2.md
        new:                data/folder
  • dvc push the folder: ok
  • dvc push again without any modification: the error appears
  • dvc status: ok (Data and pipelines are up to date)
  • dvc status -c: the error appears

Thus, in fact, there is no need to update a file in the folder to make the error appear.

nisace avatar Dec 11 '23 17:12 nisace

As it seems to be related to the cache, I've just tried to remove it:

  • rm -rf .dvc/cache
  • dvc status -c: ok
  • dvc status -c: ok
  • dvc pull: ok
  • dvc pull: ok
  • dvc status -c: the error appears

The same thing with only one dvc pull

  • rm -rf .dvc/cache
  • dvc status -c: ok
  • dvc status -c: ok
  • dvc pull: ok
  • dvc status -c: the error appears

nisace avatar Dec 11 '23 17:12 nisace

@efiop did you have a chance to look at the issue? Let me know if you need more information. Thanks !

nisace avatar Dec 18 '23 11:12 nisace

@nisace Sorry, didn't have time yet. But also doesn't look dvc related, but rather boto related, so not sure how much I can really do here.

efiop avatar Dec 18 '23 14:12 efiop

@efiop, I've looked at it again and noticed that dvc status -c always hits the raise of InfiniteLoopConfigError at this line of botocore (both with and without a dvc cache). However, the exception actually propagates (meaning the dvc status -c command breaks) only when a dvc cache is present. Thus it seems that the exception gets caught somewhere, but I haven't found where yet. Do you have any clue? Thanks!

nisace avatar Jan 11 '24 14:01 nisace

Here is what I've found so far:

nisace avatar Jan 11 '24 16:01 nisace

Hi @efiop, do you agree with the behavior described above, and does it look right to you? Let me know if you need additional information. Thanks!

nisace avatar Jan 24 '24 15:01 nisace

@nisace Sorry, I still didn't have time to look into this. But regarding pull, it is meant to catch errors, fetch whatever it can, and then check out whatever it can; checkout is where previously occurred errors might show up as missing files. I'm not sure how this could cause the "infinite loop in credential" error though.

efiop avatar Jan 24 '24 15:01 efiop

@efiop, the following code raises the InfiniteLoopConfigError error unless batch_size is set to batch_size=1 or unless there is 0 or 1 item in path. Thus, there seems to be a multi-threading issue.

Am I missing something? Hope it helps, thanks !

from dvc_s3 import S3FileSystem


s3_file_system = S3FileSystem(
    host="my-bucket",
    profile="my-profile-b",
)

result = s3_file_system.exists(
    path=[
        "my-bucket/key_0",
        "my-bucket/key_1",
    ],
    batch_size=None,
)
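
One way an error like this could arise under concurrency is sketched below as a toy model. It is not botocore's or aiobotocore's code: `ToyAssumeRoleProvider`, `LoopDetected`, and `resolve` are hypothetical names, and the assumption being illustrated is that a single provider object keeps a shared list of visited profiles while several lookups run at once.

```python
import asyncio


class LoopDetected(Exception):
    """Stand-in for botocore's InfiniteLoopConfigError (toy model only)."""


class ToyAssumeRoleProvider:
    """Hypothetical, simplified provider; not the real implementation."""

    def __init__(self, profiles, start):
        self.profiles = profiles
        self.start = start
        # Shared mutable state: every load() call on this object sees it.
        self.visited = [start]

    async def load(self):
        source = self.profiles[self.start]["source_profile"]
        if source in self.visited:
            raise LoopDetected(f"{source} already visited: {self.visited}")
        self.visited.append(source)
        await asyncio.sleep(0)  # stand-in for I/O such as credential_process
        return f"credentials-via-{source}"


PROFILES = {
    "my-profile-a": {},
    "my-profile-b": {"source_profile": "my-profile-a"},
}


async def resolve(n_concurrent):
    # One provider shared by all concurrent lookups, as assumed above.
    provider = ToyAssumeRoleProvider(PROFILES, "my-profile-b")
    return await asyncio.gather(
        *(provider.load() for _ in range(n_concurrent)),
        return_exceptions=True,
    )


single = asyncio.run(resolve(1))  # one lookup: succeeds
double = asyncio.run(resolve(2))  # two lookups: the second sees a false loop
```

In this model a single lookup succeeds, while the second of two concurrent lookups finds `my-profile-a` already in the shared list and reports a loop that does not exist in the configuration.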

nisace avatar Jan 31 '24 15:01 nisace

Does it work if you add one exists call before?

from dvc_s3 import S3FileSystem

fs = S3FileSystem(host="my-bucket", profile="my-profile-b")
# Single call first, before the batched call
fs.exists("my-bucket/key_0")
result = fs.exists(
    path=[
        "my-bucket/key_0",
        "my-bucket/key_1",
    ],
    batch_size=None,
)

skshetry avatar Feb 01 '24 04:02 skshetry

Hi @skshetry, yes, adding a first exists call makes the error disappear.
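
The effect of the warm-up call can be generalized as "create the shared client once, before or under a lock, and let concurrent callers reuse it". The sketch below shows that generic asyncio pattern only. It is not the change made in iterative/dvc-s3#67, whose contents this thread does not describe, and `LazyClient` and `fake_factory` are hypothetical names.

```python
import asyncio


class LazyClient:
    """Hypothetical wrapper; `factory` stands in for expensive client setup."""

    def __init__(self, factory):
        self._factory = factory
        self._client = None
        self._lock = asyncio.Lock()
        self.created = 0  # counts how many times the factory actually ran

    async def get(self):
        if self._client is None:
            async with self._lock:
                # Re-check after acquiring the lock: another task may have
                # finished the setup while this one was waiting.
                if self._client is None:
                    self._client = await self._factory()
                    self.created += 1
        return self._client


async def fake_factory():
    await asyncio.sleep(0)  # stand-in for credential resolution
    return object()


async def main():
    lazy = LazyClient(fake_factory)
    clients = await asyncio.gather(*(lazy.get() for _ in range(5)))
    return lazy.created, clients


created, clients = asyncio.run(main())
```

With this guard, five concurrent callers trigger a single setup and all receive the same client object, which mirrors why one sequential exists call before the batched call avoids the concurrent first-time setup.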

nisace avatar Feb 01 '24 09:02 nisace

Hi @skshetry were you able to reproduce the error and look at it? Thanks !

nisace avatar Feb 14 '24 13:02 nisace

Hi, @nisace, could you please check if iterative/dvc-s3#67 fixed the issue for you?

You can install it as follows:

pip install "dvc-s3 @ git+https://github.com/iterative/dvc-s3.git"

skshetry avatar Mar 03 '24 01:03 skshetry

Yes it does !

nisace avatar Mar 04 '24 09:03 nisace

Thanks for the confirmation, @nisace. I have just released dvc-s3==3.1.0.

skshetry avatar Mar 04 '24 09:03 skshetry

Thanks a lot !

nisace avatar Mar 04 '24 11:03 nisace