
`data status --not-in-remote`: Shows clean status even when data not pushed

Open dberenbaum opened this issue 1 year ago • 0 comments

Bug Report

Description

When new data is added but not pushed to a cloud-versioned remote, `dvc data status --not-in-remote` still reports a clean status.

Reproduce

$ git clone git@github.com:iterative/iterative.ai.git
Cloning into 'iterative.ai'...
remote: Enumerating objects: 11463, done.
remote: Counting objects: 100% (238/238), done.
remote: Compressing objects: 100% (141/141), done.
remote: Total 11463 (delta 143), reused 164 (delta 93), pack-reused 11225
Receiving objects: 100% (11463/11463), 34.83 MiB | 12.87 MiB/s, done.
Resolving deltas: 100% (6722/6722), done.

$ cd iterative.ai

$ dvc pull
Collecting                                                    |822 [00:00, 16.0kentry/s]
Fetching
Building workspace index                                      |1.00 [00:00, 17.1entry/s]
Comparing indexes                                             |824 [00:00, 8.82kentry/s]
Applying changes                                               |670 [00:00, 5.62kfile/s]
A       static/uploads/
1 file added and 670 files fetched

$ echo foo > static/uploads/foo.txt

$ dvc add static/uploads
100% Adding...|████████████████████████████████████████████████|1/1 [00:00,  1.36file/s]

$ git commit -am "add foo without pushing"
[main 11d47fd] add foo without pushing
 1 file changed, 3 insertions(+)

$ dvc data status --not-in-remote
No changes.

Expected

`dvc data status --not-in-remote` should report that `static/uploads/foo.txt` has not been pushed.
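
The expectation above can be turned into a small automated check. The sketch below is only an illustration: `is_clean_report` is a hypothetical helper name, and it assumes that a clean report consists of exactly the `No changes.` line shown in the reproduction. It does not assume anything about what the corrected output should look like.

```shell
# Hypothetical helper: reads a status report on stdin and succeeds
# only when the report is exactly the clean message "No changes.".
is_clean_report() {
    [ "$(cat)" = "No changes." ]
}

# Intended use after the reproduction steps (requires dvc and the repo):
#   if dvc data status --not-in-remote | is_clean_report; then
#       echo "BUG: unpushed data reported as clean" >&2
#   fi
```

Run right after the `git commit` step, the check should flag the bug for as long as the command keeps printing `No changes.` for unpushed data.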

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 3.40.2.dev2+g71afff6db
-----------------------------------
Platform: Python 3.11.7 on macOS-14.2.1-arm64-arm-64bit
Subprojects:
        dvc_data = 3.7.0
        dvc_objects = 3.0.6
        dvc_render = 1.0.1.dev2+gcf7bcec.d20240119
        dvc_task = 0.3.0
        scmrepo = 2.0.3
Supports:
        azure (adlfs = 2023.12.0, knack = 0.11.0, azure-identity = 1.15.0),
        gdrive (pydrive2 = 1.19.0),
        gs (gcsfs = 2023.12.2.post1),
        hdfs (fsspec = 2023.12.2, pyarrow = 14.0.2),
        http (aiohttp = 3.9.1, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.1, aiohttp-retry = 2.8.3),
        oss (ossfs = 2023.12.0),
        s3 (s3fs = 2023.12.2, boto3 = 1.33.13),
        ssh (sshfs = 2023.10.0),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8),
        webhdfs (fsspec = 2023.12.2)
Config:
        Global: /Users/dave/Library/Application Support/dvc
        System: /Library/Application Support/dvc
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk3s1s1
Caches: local
Remotes: s3
Workspace directory: apfs on /dev/disk3s1s1
Repo: dvc, git
Repo.site_cache_dir: /Library/Caches/dvc/repo/626665f97faf0086be51914a80de72b1

Additional Information (if any):

Related slack discussion

dberenbaum, Jan 22 '24 21:01