
Negation "!" in .dvcignore doesn't unignore

Open Eve-ning opened this issue 2 years ago • 5 comments

Bug Report

Issue name

Negation "!", commonly used in .gitignore to "unignore" files, doesn't work in DVC.

Description

Given 2 files:

  • ignore.txt
  • no-ignore.txt

With this .gitignore file, Git ignores ignore.txt but not no-ignore.txt:

ignore.txt
!no-ignore.txt

However, with the same patterns in .dvcignore, DVC doesn't "unignore" no-ignore.txt: dvc check-ignore still reports it as ignored.
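For reference, the gitignore rule at stake is that the last matching pattern decides, and a leading "!" flips the verdict to "not ignored". The snippet below is a minimal sketch of that rule only, assuming plain basename globbing via the standard library's fnmatch; it is not how Git or DVC implement matching, and is_ignored is a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def is_ignored(name, patterns):
    """Return True if `name` ends up ignored under last-match-wins rules.

    Simplification (an assumption of this sketch): patterns are matched
    against a bare file name only, with no directory or anchoring rules.
    """
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(name, glob):
            # A later match overrides any earlier one.
            ignored = not negated
    return ignored


patterns = ["ignore.txt", "!no-ignore.txt"]
print(is_ignored("ignore.txt", patterns))     # True
print(is_ignored("no-ignore.txt", patterns))  # False
```

Under this rule a path matched only by a negated pattern is never reported as ignored, which is what git check-ignore shows in the reproduction.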

Reproduce

Output shown in comments

touch ignore.txt no-ignore.txt
echo -e 'ignore.txt\n!no-ignore.txt' > .dvcignore
echo -e 'ignore.txt\n!no-ignore.txt' > .gitignore
cat .dvcignore
# ignore.txt
# !no-ignore.txt
git check-ignore ignore.txt no-ignore.txt
# ignore.txt
dvc check-ignore ignore.txt no-ignore.txt
# ignore.txt
# no-ignore.txt

Expected

Ideally, git check-ignore and dvc check-ignore should produce the same output for the same patterns.
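A parity check along these lines could look like the sketch below. It assumes both tools print one ignored path per line, as in the reproduction; check_ignore and parity_report are hypothetical helper names, and the example at the bottom feeds in the outputs from the reproduction so it runs without git or dvc installed.

```python
import subprocess


def check_ignore(tool, paths):
    """Run `<tool> check-ignore <paths>` and return the set of reported paths.

    check=False because git check-ignore exits non-zero when none of the
    given paths is ignored; that is not an error for this comparison.
    """
    result = subprocess.run(
        [tool, "check-ignore", *paths],
        capture_output=True,
        text=True,
        check=False,
    )
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def parity_report(git_ignored, dvc_ignored):
    """List the paths that only one of the two tools reports as ignored."""
    git_set, dvc_set = set(git_ignored), set(dvc_ignored)
    return {
        "only_git": sorted(git_set - dvc_set),
        "only_dvc": sorted(dvc_set - git_set),
    }


# Outputs copied from the reproduction in this report.
report = parity_report({"ignore.txt"}, {"ignore.txt", "no-ignore.txt"})
print(report)  # {'only_git': [], 'only_dvc': ['no-ignore.txt']}
```

In a real repository the two sets would come from check_ignore("git", paths) and check_ignore("dvc", paths); both "only" lists being empty would mean the tools agree.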

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 3.30.1 (pip)
-------------------------
Platform: Python 3.10.12 on Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Subprojects:
        dvc_data = 2.22.0
        dvc_objects = 1.2.0
        dvc_render = 0.6.0
        dvc_task = 0.3.0
        scmrepo = 1.4.1
Supports:
        gs (gcsfs = 2023.10.0),
        http (aiohttp = 3.9.0, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.0, aiohttp-retry = 2.8.3)
Config:
        Global: /home/jc/.config/dvc
        System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: 9p on drvfs
Caches: local
Remotes: gs
Workspace directory: 9p on drvfs
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/67bc00a31a88271f0f2653ea5494f098

Additional Information (if any):

Eve-ning avatar Nov 29 '23 08:11 Eve-ning

@Eve-ning I think the dvcignore feature, i.e. the pattern matching, is working fine; the bug is in the dvc check-ignore command. dvc add does unignore the files marked with !. You can try the following.

file: .dvcignore
to_ignore*
!to_ignore-NOT.txt

$ mkdir data
$ echo '1' > data/to_ignore.txt
$ echo '1' > data/to_ignore-NOT.txt

$ dvc add data

$ tree .dvc/cache/files/md5

└── c1
    └── ba58b05f6245f221ad65391fa6690b    <<-- md5 for data/to_ignore-NOT.txt is added to cache

$ md5 data/*

MD5 (data/to_ignore-NOT.txt) = c1ba58b05f6245f221ad65391fa6690b
MD5 (data/to_ignore.txt) = 919d117956d3135c4c683ff021352f5c

It looks like this is the expected behaviour of check-ignore. This is the test for the behaviour.

@dberenbaum Do you think this test is correct, or am I missing something here? Tagging you since the last activity on this ticket was yours.

anunayasri avatar Oct 07 '24 11:10 anunayasri

Yes, I think that is correct. It looks related to a previous issue in https://github.com/iterative/dvc/issues/5046. DVC uses both of these methods and they don't seem to always be consistent:

https://github.com/iterative/dvc/blob/9b5772fab8ad6ca7e885c97d094043b6ac2e34a9/dvc/ignore.py#L395-L409

https://github.com/iterative/dvc/blob/9b5772fab8ad6ca7e885c97d094043b6ac2e34a9/dvc/ignore.py#L411-L424
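One way two such code paths can disagree is sketched below. This is an illustration under stated assumptions, not a reading of the linked lines: one hypothetical function reports a path as soon as any pattern matches it, negated or not, while the other takes the verdict from the last matching pattern. Matching is simplified to basename globbing with fnmatch.

```python
from fnmatch import fnmatchcase


def matching_patterns(name, patterns):
    """Return every pattern whose glob matches `name`, negated ones included."""
    matches = []
    for pattern in patterns:
        glob = pattern[1:] if pattern.startswith("!") else pattern
        if fnmatchcase(name, glob):
            matches.append(pattern)
    return matches


def reported_if_any_match(name, patterns):
    """Hypothetical path A: any matching pattern counts as 'ignored'."""
    return bool(matching_patterns(name, patterns))


def reported_if_last_match_ignores(name, patterns):
    """Hypothetical path B: the last matching pattern decides."""
    matches = matching_patterns(name, patterns)
    return bool(matches) and not matches[-1].startswith("!")


patterns = ["to_ignore*", "!to_ignore-NOT.txt"]
for name in ("to_ignore.txt", "to_ignore-NOT.txt"):
    print(name,
          reported_if_any_match(name, patterns),
          reported_if_last_match_ignores(name, patterns))
```

The two functions agree on to_ignore.txt and disagree on to_ignore-NOT.txt, which is the same shape as the difference between dvc add and dvc check-ignore described above.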

dberenbaum avatar Oct 08 '24 14:10 dberenbaum

git check-ignore and dvc check-ignore behave differently for the same ignore patterns.

With the following patterns in both .dvcignore and .gitignore:

data/data1
to_ignore*
!to_ignore-NOT.txt

the outputs differ:

dvc-demo-2 on  master [!?] via 🐍 v3.11.4 (.venv)
❯ git check-ignore data/*
data/data1
data/to_ignore.txt

dvc-demo-2 on  master [!?] via 🐍 v3.11.4 (.venv)
❯ dvc check-ignore data/*
== I am LOCAL DVC ==
data/data1
data/to_ignore-NOT.txt
data/to_ignore.txt

@dberenbaum This definitely seems like a bug. I believe the test cases are wrong and need to be updated. Can the team confirm?

anunayasri avatar Oct 08 '24 16:10 anunayasri

@anunayasri good catch. Yes, I think DVC check-ignore should behave the same way as git.

I believe the test cases are wrong and need to be updated. Can the team confirm?

Only the test case, or the implementation as well?

shcheklein avatar Oct 10 '24 02:10 shcheklein

@shcheklein Of course, we have to update both the test case and the implementation. By "test case" I meant that the expected behaviour it encodes is misaligned.
I am working on another issue and will try to fix this after that.

anunayasri avatar Oct 10 '24 05:10 anunayasri