datalad icon indicating copy to clipboard operation
datalad copied to clipboard

Should save commit updated index?

Open yarikoptic opened this issue 3 years ago • 4 comments

use-case -- update subdataset state without having it installed. datalad save seems to ignore the fact that index was updated and thus does not do any commit:

(git-annex)lena:/tmp/openneuro[master]
$> git status; echo '160000 797debeb3bc86400a8bbe3c3d4641285f6fb00c9 0  ds000001' | git update-index --index-info; git diff --cached                
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
diff --git a/ds000001 b/ds000001
index f8e27ac..797debe 160000
--- a/ds000001
+++ b/ds000001
@@ -1 +1 @@
-Subproject commit f8e27ac909e50b5b5e311f6be271f0b1757ebb7b
+Subproject commit 797debeb3bc86400a8bbe3c3d4641285f6fb00c9

$> datalad save -m "saving updated index"


$> git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   ds000001

I guess the answer could be: save cares only about the current tree state (and here is there is no submodule installed) and ignores index, and thus should do nothing.

yarikoptic avatar Aug 26 '22 14:08 yarikoptic

@yarikoptic What if you do mkdir ds000001 before datalad save? Does that change anything? (I ask because, in my experiments, Git treated the submodule as unstaged-deleted after update-index if its directory didn't exist.)

jwodder avatar Aug 26 '22 15:08 jwodder

Just try and see. datalad install ///openneuro to get initial situation

yarikoptic avatar Aug 26 '22 16:08 yarikoptic

I tried the following, but datalad save didn't like it:

  1. Create a Git repo foo
  2. datalad create bar; cd bar
  3. Set the contents of .gitmodules to:
[submodule "foo"]
        path = foo
        url = ../foo
  1. printf '160000 commit e3ad37b0a2dc63f7f35e4971bb40ee0f39aa8a09\tfoo\n' | git update-index --index-info
  2. mkdir foo
  3. datalad save -m Test

Output from datalad save:

[WARNING] Received an exception CommandError(CommandError: 'git -c diff.ignoreSubmodules=none commit -m Test -- .gitmodules foo' failed with exitcode 128 under /Users/jwodder/work/dev/tmp/foo/bar [err: 'error: 'foo' does not have a commit checked out
fatal: updating files failed']). Canceling not-yet running jobs and waiting for completion of running. You can force earlier forceful exit by Ctrl-C.
[INFO   ] Canceled 0 out of 0 jobs. 0 left running.
Total: 0.00 datasets [00:00, ? datasets/s]CommandError: 'git -c diff.ignoreSubmodules=none commit -m Test -- .gitmodules foo' failed with exitcode 128 under /Users/jwodder/work/dev/tmp/foo/bar
error: 'foo' does not have a commit checked out
fatal: updating files failed

Note that doing git commit -m Test instead works just fine.

jwodder avatar Aug 26 '22 17:08 jwodder

yeah, because foo is a new (not yet known) element to git committed state, datalad tries on operate on it and provides foo as argument to commit. As I have mentioned in https://github.com/dandi/dandisets/issues/256#issuecomment-1228589957, to update an already known subdataset we must not provide its path to commit and just stage that change in index. The situation you @jwodder outlined emphasizes the fact that datalad at large ignores index and considers only what it sees on the drive (a new foo as it is now listed in .gitmodules I guess)

yarikoptic avatar Aug 26 '22 18:08 yarikoptic

let's consider superseded by more detailed https://github.com/datalad/datalad/issues/7074

yarikoptic avatar Oct 07 '22 21:10 yarikoptic