datalad
problematic effect of "git mv" of a submodule - not renamed
$> p=/tmp/testds; rm -rf $p; datalad create $p; cd $p; datalad create -d . subm1 && datalad save && git mv subm1 subm1-moved && datalad create -d . subm1
[INFO ] Creating a new annex repo at /tmp/testds
create(ok): /tmp/testds (dataset)
[INFO ] Creating a new annex repo at /tmp/testds/subm1
create(ok): subm1 (dataset)
add(ok): subm1 (file)
add(ok): .gitmodules (file)
save(ok): . (dataset)
action summary:
add (ok: 2)
save (ok: 1)
[ERROR ] collision with content in parent dataset at /tmp/testds: ['/tmp/testds/subm1'] [create(/tmp/testds/subm1)]
$> ls -l
total 4
drwx------ 4 yoh yoh 4096 May 30 08:55 subm1-moved/
$> git submodule
b096a2f0558c817767872b89e888957463d9d5f3 subm1-moved (heads/master)
$> cat .gitmodules
[submodule "subm1"]
path = subm1-moved
url = ./subm1
datalad-id = 3dbc6d14-82da-11e9-8069-8019340ce7f2
$> datalad subdatasets
subdataset(ok): subm1-moved (dataset)
$> git version
git version 2.21.0.593.g511ec345e18
I wonder if this is something we should seek to have fixed in git? I expected both the submodule name and url to be adjusted by git mv.
0.11.x runs without a crash, but the information about the moved one is gone:
(git)hopa:~datalad/datalad[0.11.x]git-annex
$> p=/tmp/testds; rm -rf $p; datalad create $p; cd $p; datalad create -d . subm1 && datalad save && git mv subm1 subm1-moved && datalad create -d . subm1
[INFO ] Creating a new annex repo at /tmp/testds
create(ok): /tmp/testds (dataset)
[INFO ] Creating a new annex repo at /tmp/testds/subm1
create(ok): subm1 (dataset)
action summary:
add (notneeded: 2, ok: 1)
create (ok: 1)
save (ok: 1)
[INFO ] Creating a new annex repo at /tmp/testds/subm1
create(ok): subm1 (dataset)
action summary:
add (notneeded: 2, ok: 1)
create (ok: 1)
save (ok: 1)
(dev3) 1 12517.....................................:Thu 30 May 2019 09:00:39 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> cat .gitmodules
[submodule "subm1"]
path = subm1
url = ./subm1
(dev3) 1 12518.....................................:Thu 30 May 2019 09:00:44 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> git status
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: subm1-moved
(git-annex)hopa:/tmp/testds[master]git
$> git commit -m new
[master 90aed10] new
1 file changed, 1 insertion(+)
create mode 160000 subm1-moved
(dev3) 1 12520.....................................:Thu 30 May 2019 09:01:58 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> cat .gitmodules
[submodule "subm1"]
path = subm1
url = ./subm1
(dev3) 1 12521.....................................:Thu 30 May 2019 09:02:01 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> git status
On branch master
nothing to commit, working tree clean
(dev3) 1 12522.....................................:Thu 30 May 2019 09:02:16 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> git submodule
316bc1be035fec6e2d8b1d7a91c9f4da8821a09f subm1 (heads/master)
fatal: no submodule mapping found in .gitmodules for path 'subm1-moved'
Actually, if I do a save after git mv on the master version, it also runs without a crash, but with the same problematic result, and this time git submodule just "forgets" about the new one:
$> p=/tmp/testds; rm -rf $p; datalad create $p; cd $p; datalad create -d . subm1 && datalad save && git mv subm1 subm1-moved && datalad save -m moved && datalad create -d . subm1
[INFO ] Creating a new annex repo at /tmp/testds
create(ok): /tmp/testds (dataset)
[INFO ] Creating a new annex repo at /tmp/testds/subm1
create(ok): subm1 (dataset)
add(ok): subm1 (file)
add(ok): .gitmodules (file)
save(ok): . (dataset)
action summary:
add (ok: 2)
save (ok: 1)
save(ok): . (dataset)
[INFO ] Creating a new annex repo at /tmp/testds/subm1
create(ok): subm1 (dataset)
(dev3) 1 12530.....................................:Thu 30 May 2019 09:03:16 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> cat .gitmodules
[submodule "subm1"]
path = subm1-moved
url = ./subm1
datalad-id = 47b13182-82db-11e9-8069-8019340ce7f2
(dev3) 1 12531.....................................:Thu 30 May 2019 09:03:18 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> git submodule
5fcc9e437e7254a64db2a5d16e2dfea83198c886 subm1-moved (heads/master)
(dev3) 1 12532.....................................:Thu 30 May 2019 09:03:23 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> ls -l
total 8
drwx------ 4 yoh yoh 4096 May 30 09:03 subm1/
drwx------ 4 yoh yoh 4096 May 30 09:03 subm1-moved/
(dev3) 1 12533.....................................:Thu 30 May 2019 09:03:32 AM EDT:.
(git-annex)hopa:/tmp/testds[master]git
$> datalad subdatasets
subdataset(ok): subm1-moved (dataset)
I expect both submodule name and url being adjusted by git mv
FWIW I didn't expect either of these because
- the name is inferred from the path if --name isn't specified, but there isn't an inherent coupling between the path and name
- the url isn't tied to the local repository's state in the typical non-datalad case. Even when relative paths are given, they're usually taken as relative to some remote (upstream if configured or "origin"). It's only when that doesn't exist that the relative path is considered to be relative to the current working directory. Even if Git determined that your configured state was using the current directory, this isn't necessarily true for other people's clones, so I don't think it'd want to update the tracked .gitmodules file.
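To make the decoupling of name and path concrete, here is a small sketch that reads a .gitmodules file with git config and reports entries whose name no longer matches the recorded path. list_renamed_submodules is a made-up helper name for illustration, not a Git or DataLad command, and it only inspects the file it is given.

```shell
# Hypothetical helper (not part of Git or DataLad): print "<name> <path>"
# for every entry in a .gitmodules file whose submodule name differs from
# its recorded path. Defaults to ./.gitmodules when no file is given.
list_renamed_submodules() {
    git config -f "${1:-.gitmodules}" --get-regexp '^submodule\..*\.path$' |
    while read -r key path; do
        name=${key#submodule.}   # strip the leading "submodule."
        name=${name%.path}       # strip the trailing ".path"
        [ "$name" = "$path" ] || printf '%s %s\n' "$name" "$path"
    done
}
```

Run against the .gitmodules from the report above (name subm1, path subm1-moved), this should print `subm1 subm1-moved`, which is the state git mv leaves behind: path updated, name and url untouched.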
sounds like we might be doomed to introduce datalad rename or datalad mv to facilitate our common use case(s).
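For illustration only, this is roughly what such a command might have to do to .gitmodules after the git mv. It is a sketch under assumptions, not how DataLad or Git implement anything: sync_gitmodules_entry is a made-up helper, it assumes the url uses the relative ./<path> form seen in the report, and it edits only the given .gitmodules file. Any copy of the submodule name in .git/config or under .git/modules is not handled here.

```shell
# Hypothetical helper (not a DataLad or Git command).
# Usage: sync_gitmodules_entry <gitmodules-file> <old-name> <new-path>
# Assumes "git mv" has already updated the "path" value of the entry.
sync_gitmodules_entry() {
    file=$1 old=$2 new=$3
    # Point the relative URL at the new location of the subdataset.
    git config -f "$file" "submodule.$old.url" "./$new" &&
    # Rename the section so that the submodule name matches the path again.
    git config -f "$file" --rename-section "submodule.$old" "submodule.$new"
}
```

For the case in this issue the call would be `sync_gitmodules_entry .gitmodules subm1 subm1-moved`, followed by staging and committing .gitmodules (for example with datalad save). Whether renaming the section is desirable at all is a design question, since other clones may already know the submodule under its old name.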
Echo chamber ;-) https://github.com/datalad/datalad/issues/1193
As for the need for the command: yes. But the original issue was about the use case of moving files between datasets.
FWIW -- remains pertinent in 2021
lena:/tmp
$> p=/tmp/testds; rm -rf $p; datalad create $p; cd $p; datalad create -d . subm1 && datalad save && git mv subm1 subm1-moved && datalad create -d . subm1
[INFO ] Creating a new annex repo at /tmp/testds
create(ok): /tmp/testds (dataset)
[INFO ] Creating a new annex repo at /tmp/testds/subm1
add(ok): subm1 (file)
add(ok): .gitmodules (file)
save(ok): . (dataset)
create(ok): subm1 (dataset)
action summary:
add (ok: 2)
create (ok: 1)
save (ok: 1)
create(error): subm1 (dataset) [collision with /tmp/testds/subm1 (dataset) in dataset /tmp/testds]
$> datalad --version
datalad 0.15.3
and in 2023 too:
(fdm-werkstatt) adina@muninn in /tmp
❱ p=/tmp/testds; rm -rf $p; datalad create $p; cd $p; datalad create -d . subm1 && datalad save && git mv subm1 subm1-moved && datalad create -d . subm1
[WARNING] Requested extension 'next' is not available
create(ok): /tmp/testds (dataset)
[WARNING] Requested extension 'next' is not available
add(ok): subm1 (dataset)
add(ok): .gitmodules (file)
save(ok): . (dataset)
create(ok): subm1 (dataset)
action summary:
add (ok: 2)
create (ok: 1)
save (ok: 1)
[WARNING] Requested extension 'next' is not available
[WARNING] Requested extension 'next' is not available
create(error): subm1 (dataset) [collision with /tmp/testds/subm1 (dataset) in dataset /tmp/testds]
(fdm-werkstatt) adina@muninn in /tmp/testds on git:master+
❱ datalad --version
[WARNING] Requested extension 'next' is not available
datalad 0.19.0