dvc pull: re.error: redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Open arthurkok2 opened this issue 3 years ago • 16 comments

Bug Report

Description

Running dvc pull fails because Python's re module raises an error: re.error: redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Reproduce

  1. run dvc pull

Expected

dvc pull should complete without throwing an error.

Environment information

Output of dvc doctor:

DVC version: 2.0.18 (pip)
---------------------------------
Platform: Python 3.9.12 on Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Supports: http, https, s3
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: s3
Workspace directory: 9p on drvfs
Repo: dvc (no_scm)

Additional Information (if any):

2022-08-31 12:34:03,876 ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46
------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/main.py", line 55, in main
    ret = cmd.run()
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/command/data_sync.py", line 29, in run
    stats = self.repo.pull(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/pull.py", line 29, in pull
    processed_files_count = self.fetch(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/fetch.py", line 43, in fetch
    used = self.used_cache(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 396, in used_cache
    for stage, filter_info in pairs:
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 389, in <genexpr>
    self.stage.collect_granular(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/stage.py", line 397, in collect_granular
    stages, file, _ = _collect_specific_target(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/stage.py", line 91, in _collect_specific_target
    if not (recursive and loader.fs.isdir(target)):
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/fs/local.py", line 74, in isdir
    return not (use_dvcignore and self.dvcignore.is_ignored_dir(path_info))
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/fs/local.py", line 42, in dvcignore
    return cls(self, root)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/ignore.py", line 196, in __init__
    self.ignores_trie_fs[root_dir] = DvcIgnorePatterns(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/ignore.py", line 43, in __init__
    self.ignore_spec = [
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/ignore.py", line 44, in <listcomp>
    (ignore, re.compile("|".join(item[0] for item in group)))
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_parse.py", line 950, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_parse.py", line 833, in _parse
    raise source.error(err.msg, len(name) + 1) from None
re.error: redefinition of group name 'ps_d' as group 2; was group 1 at position 46
------------------------------------------------------------
2022-08-31 12:34:04,803 DEBUG: Version info for developers:
DVC version: 2.0.18 (pip)
---------------------------------
Platform: Python 3.9.12 on Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Supports: http, https, s3
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: s3
Workspace directory: 9p on drvfs
Repo: dvc (no_scm)

arthurkok2 avatar Aug 31 '22 16:08 arthurkok2

@arthurkok2, could you please share .dvcignore file or try removing certain items of it to see what is causing this?

skshetry avatar Aug 31 '22 16:08 skshetry

> @arthurkok2, could you please share .dvcignore file or try removing certain items of it to see what is causing this?

My .dvcignore is empty except for the default comments:

# Add patterns of files dvc should ignore, which could improve
# the performance. Learn more at
# https://dvc.org/doc/user-guide/dvcignore

arthurkok2 avatar Aug 31 '22 16:08 arthurkok2

@arthurkok2 Do you still have the issue if you upgrade dvc?

rlamy avatar Aug 31 '22 17:08 rlamy

I am hitting the same error. Note that dvc doctor also fails.

$ dvc pull
ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
$ dvc doctor
ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
$ dvc --version
2.23.0
$ uname -s -r -v -p
Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64
$ python --version
Python 3.8.10

eric-seppanen avatar Aug 31 '22 23:08 eric-seppanen

The fact that two different people on very different versions hit this very specific error within a few hours makes me suspect that some dependency just released a broken version.

eric-seppanen avatar Aug 31 '22 23:08 eric-seppanen

This is caused by pathspec 0.10.0, which was released 2022-08-30.

pip install dvc explicitly warns about the incompatibility. I missed the warning because I didn't know that pip ignores errors, and because the output was buried deep in a docker build log.

Downgrading to pathspec 0.9.0 fixes this for me.
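
To check quickly whether an environment has drifted past the pin mentioned in this thread, something like the following may help. It is only a sketch: the helper names `violates_pin` and `installed_pathspec_version` are made up for illustration, and the `<0.10` bound is taken from the discussion here, so it applies only to DVC releases that carry that pin.

```python
import re
from importlib import metadata


def violates_pin(version: str) -> bool:
    """Return True if `version` is 0.10 or newer, i.e. outside the `<0.10` pin.

    Hypothetical helper: only the leading "major.minor" part is inspected.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 10)


def installed_pathspec_version():
    """Return the installed pathspec version string, or None if it is absent."""
    try:
        return metadata.version("pathspec")
    except metadata.PackageNotFoundError:
        return None


found = installed_pathspec_version()
if found is None:
    print("pathspec is not installed in this environment")
elif violates_pin(found):
    print(f"pathspec {found} is outside the <0.10 pin discussed in this thread")
else:
    print(f"pathspec {found} satisfies the <0.10 pin")
```

If the check reports a version outside the pin, installing with an explicit constraint such as `pip install "pathspec<0.10"` is one way to get back to a 0.9.x release.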

eric-seppanen avatar Sep 01 '22 01:09 eric-seppanen

> @arthurkok2 Do you still have the issue if you upgrade dvc?

For me, no: the issue goes away after upgrading to 2.23.0. However, it seems others are reporting the issue even on this version.

arthurkok2 avatar Sep 01 '22 14:09 arthurkok2

We were facing the same issue in the dvc pre-commit hook. Running pre-commit autoupdate before the hooks fixed it for us.

raychinov avatar Sep 02 '22 08:09 raychinov

The issue comes from this change (1c8c980) in pathspec 0.10.0, which starts using a named group, ps_d, to match directory markers (e.g. /). Since we concatenate regexes here: https://github.com/iterative/dvc/blob/660c17f654f096096d3d552776706d4e463cdaaf/dvc/ignore.py#L42-L48

we end up having multiple groups with the same name, which results in the re.compile error we see.
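
The failure can be reproduced without DVC or pathspec at all. The sketch below is not DVC's code: the two patterns are hand-written stand-ins that merely share a named group `ps_d`, and the helper names are hypothetical. It joins them with `|` the way the traceback shows, then shows that compiling each pattern on its own avoids the name clash (one possible approach, not necessarily the fix DVC adopted).

```python
import re

# Hand-written stand-ins: each pattern carries its own `ps_d` named group,
# similar in spirit to what is described for pathspec 0.10.0 above.
patterns = [r"^foo(?P<ps_d>/).*$", r"^bar(?P<ps_d>/).*$"]


def compile_joined(regexes):
    """Join the regexes with '|' and compile; return the error if it fails."""
    try:
        return re.compile("|".join(regexes))
    except re.error as exc:
        return exc


def matches_any(compiled, path):
    """Return True if any of the separately compiled patterns matches `path`."""
    return any(regex.match(path) for regex in compiled)


# Joining puts two groups named `ps_d` into a single pattern, which
# Python's re module rejects at compile time.
error = compile_joined(patterns)
print(type(error).__name__, error)

# Compiling each pattern separately keeps the group names in different
# pattern objects, so there is no redefinition.
compiled = [re.compile(regex) for regex in patterns]
print(matches_any(compiled, "bar/data.csv"))
```

Group names must be unique within a single compiled pattern, which is why the same name is harmless across separate patterns but fatal once they are concatenated.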

dtrifiro avatar Sep 02 '22 10:09 dtrifiro

Also, as @eric-seppanen pointed out, pathspec is explicitly pinned to <0.10 in both dvc and scmrepo.
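
To see which pathspec constraint an installed distribution actually declares, the standard library's `importlib.metadata.requires` can be queried. This is a rough sketch: `filter_requirements` and `declared_pins` are hypothetical helpers, and the name matching is a simple prefix check rather than a full requirement parser.

```python
import re
from importlib import metadata


def filter_requirements(requirements, package):
    """Keep only the requirement strings whose project name equals `package`."""
    kept = []
    for requirement in requirements:
        # The project name is the leading run of name characters; version
        # specifiers, extras and markers start with other characters.
        name = re.match(r"[A-Za-z0-9._-]+", requirement.strip())
        if name and name.group(0).lower() == package.lower():
            kept.append(requirement)
    return kept


def declared_pins(dist_name, package="pathspec"):
    """Return the requirements on `package` declared by an installed distribution.

    Returns an empty list if `dist_name` is not installed or declares nothing.
    """
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
    return filter_requirements(requirements, package)


for dist in ("dvc", "scmrepo"):
    print(dist, declared_pins(dist))
```

The output depends entirely on what is installed; an empty list means the distribution is missing or declares no pathspec requirement.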

dtrifiro avatar Sep 02 '22 10:09 dtrifiro

Thanks @eric-seppanen, @dtrifiro. I did not notice that. I'm lifting the p0-critical label (https://github.com/iterative/dvc/labels/p0-critical).

skshetry avatar Sep 02 '22 16:09 skshetry

I am getting the same error here.

image

dvc version: 2.3.0

python version: 3.7.9

Please suggest a fix.

Varungarg97 avatar Sep 06 '22 11:09 Varungarg97

@Varungarg97 The workaround is to downgrade pathspec to 0.9.0.

rlamy avatar Sep 06 '22 11:09 rlamy

@rlamy Thank you, it is working now.

Varungarg97 avatar Sep 06 '22 12:09 Varungarg97

Hi @Varungarg97, what dvc version are you using?

dtrifiro avatar Sep 06 '22 12:09 dtrifiro

2.3.0

Varungarg97 avatar Sep 06 '22 14:09 Varungarg97

I also got this error, but from dvc init, after running nix shell nixpkgs#dvc-with-remotes:

❯ dvc init
ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

Datasets on  main [✘!+?] on ☁️  [email protected](europe-west4) 
❯ dvc --version
2.17.0
❯ nix flake metadata nixpkgs
Resolved URL:  github:NixOS/nixpkgs/nixos-22.11
Locked URL:    github:NixOS/nixpkgs/cbe419ed4c8f98bd82d169c321d339ea30904f1f
Description:   A collection of packages for the Nix package manager
Path:          /nix/store/d2flirhsd337gm8j8rxlqklslryx6g3q-source
Revision:      cbe419ed4c8f98bd82d169c321d339ea30904f1f
Last modified: 2022-12-20 09:36:45

carlthome avatar Dec 23 '22 14:12 carlthome

I got the same issue described above with the standard dvc package on NixOS too.

Same logs and same dvc version as above.

dvc pull also fails with the same error.

semaraugusto avatar Jan 04 '23 20:01 semaraugusto