`submodules.get()` throws a ValueError exception rather than returning None as documented.
The documentation for the `get()` method of `SubmoduleCollection` states that it returns None if the submodule is not found.
However, the following sample script prints `<class 'ValueError'> submodule 'second' has not been added yet` instead of "Success":
import tempfile
from pygit2 import init_repository

# This should print "Success"
with tempfile.TemporaryDirectory() as prefix:
    first_prefix = f'{prefix}/first'
    second_prefix = f'{first_prefix}/second'
    first = init_repository(first_prefix)
    second = init_repository(second_prefix)
    try:
        assert first.submodules.get('second') is None
        print('Success')
    except Exception as e:
        print(type(e), e)
Looking at the code of `get()`, it catches KeyError but not ValueError. Is this expected but undocumented behavior, or should the function also catch ValueError?
The previous error may be due to my misunderstanding of submodule handling in libgit2. It seems that a repository nested within another repository is considered a submodule even when it has not been explicitly added as one. This does not follow the behavior of the git CLI.
Here is a sample script showing how the git CLI behaves in this situation:
#! /bin/bash -xe
prefix=$(mktemp -d)
cd ${prefix}
git init first
cd first
git init second
touch second/.gitkeep
git -C second add .gitkeep
git -C second commit -m "Initial commit."
echo -e "\n> 'first' has no submodules"
git submodule
echo -e "\n> 'second' is not staged"
git status
rm -rf ${prefix}
Which outputs:
Initialized empty Git repository in /tmp/tmp.B6hf4Ktp8n/first/.git/
Initialized empty Git repository in /tmp/tmp.B6hf4Ktp8n/first/second/.git/
[main (root-commit) 98c0c51] Initial commit.
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 .gitkeep
> 'first' has no submodules
> 'second' is not staged
On branch main
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
second/
nothing added to commit but untracked files present (use "git add" to track)
Furthermore, from libgit2 documentation:
Submodule support in libgit2 builds a list of known submodules and keeps it in the repository. The list is built from the .gitmodules file, the .git/config file, the index, and the HEAD tree. Items in the working directory that look like submodules (i.e. a git repo) but are not mentioned in those places won't be tracked.
Here is an updated version of the initial script.
'second' is considered a submodule of 'first', but it is not staged, so looking it up among the submodules of 'first' fails with a ValueError.
Staging 'second' solves the issue, and it is then successfully listed as a submodule of 'first'.
However, the .gitmodules file is not created and 'first' is still dirty...
Setting the URL of the 'second' submodule then triggers the generation of the .gitmodules file.
Committing 'second' and '.gitmodules' works, but 'first' is still dirty.
import tempfile
import os
from pygit2 import init_repository, Signature
from pygit2.enums import FileStatus

# This should print "Success"
# `prefix` is the TemporaryDirectory object; its path is `prefix.name`.
prefix = tempfile.TemporaryDirectory()
first_prefix = f'{prefix.name}/first'
second_prefix = f'{first_prefix}/second'
try:
    first = init_repository(first_prefix, initial_head="main")
    second = init_repository(second_prefix, initial_head="main")
    with open(f'{second_prefix}/.gitkeep', 'w') as fd:
        fd.write("")
    second.index.add('.gitkeep')
    tree = second.index.write_tree()
    s = Signature('test', 'test')
    second.create_commit("HEAD", s, s, 'Initial commit.', tree, [])
    assert second.status() == {}, '"second" wt is dirty.'
    assert first.status() == {'second/': FileStatus.WT_NEW}
    try:
        # This throws an unexpected `ValueError`
        assert first.submodules.get('second') is not None
    except ValueError as e:
        print('Unexpected error: ', type(e), e)
    # Adding 'second' to the index of 'first' fixes the unexpected throw and
    # 'second' is considered a submodule...
    first.index.add('second')
    assert first.submodules.get('second') is not None
    # But the .gitmodules file does not exist...
    try:
        assert os.path.exists(f'{first_prefix}/.gitmodules'), '.gitmodules file does not exists...'
    except AssertionError as e:
        print(e)
    # And 'second' is considered 'WT_MODIFIED' while being staged...
    try:
        assert (status := first.status()) == {'second': FileStatus.INDEX_NEW}, status
    except AssertionError as e:
        print('Unexpected status: ', e)
    # Modifying the submodule url triggers the .gitmodules generation
    first.submodules.get('second').url = 'ssh://[email protected]:2222/second'
    assert os.path.exists(f'{first_prefix}/.gitmodules'), '.gitmodules file does not exists...'
    first.index.add_all()
    first.index.write()
    tree = first.index.write_tree()
    first.create_commit('HEAD', s, s, 'Adding "second" submodule', tree, [])
    assert '.gitmodules' not in first.status()
    # And 'second' is still considered 'WT_MODIFIED'...
    try:
        assert (status := first.status()) == {}, status
    except AssertionError as e:
        print('Unexpected status: ', e)
finally:
    prefix.cleanup()
This script outputs:
Unexpected error: <class 'ValueError'> submodule 'second' has not been added yet
.gitmodules file does not exists...
Unexpected status: {'second': <FileStatus.INDEX_NEW|WT_MODIFIED: 257>}
Unexpected status: {'second': <FileStatus.WT_MODIFIED: 256>}
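To make the two status values easier to read, here is a small sketch that decodes them as bit flags. `StatusFlags` is a hypothetical stand-in, not pygit2's `FileStatus`; it only mirrors the two values visible in the output above (`INDEX_NEW|WT_MODIFIED: 257` and `WT_MODIFIED: 256`, hence `INDEX_NEW` = 1).

```python
from enum import IntFlag


class StatusFlags(IntFlag):
    """Hypothetical stand-in holding only the two flags seen in the output."""
    INDEX_NEW = 1
    WT_MODIFIED = 256


def describe(value):
    """Return the names of the flags that are set in `value`."""
    return [flag.name for flag in StatusFlags if value & flag]


print(describe(257))  # ['INDEX_NEW', 'WT_MODIFIED']
print(describe(256))  # ['WT_MODIFIED']
```

Read this way, 'second' is staged as new but also reported as modified in the working tree before the commit, and only the working-tree flag remains after it.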
A few additional comments:
- Calling `second.index.write()` after `second.index.add('.gitkeep')` solves the issue of 'first' remaining dirty.
- Setting the submodule URL triggers the creation of the `.gitmodules` file, but it is lacking the local path:

      [submodule "second"]
          url = ssh://[email protected]:2222/second

  According to the libgit2 documentation, there is no `git_submodule_set_path()` helper, so we can't add a helper in pygit2 to update the submodule path afterwards.
- Calling `first.index.add('second')` will not generate the `.gitmodules` file even if the remote `origin` is correctly set.
- Calling `first.submodules.add('ssh://[email protected]:2222/second')` raises a `GitError` (`_pygit2.GitError: the repository is not empty`), but the `.gitmodules` file is correctly created:

      [submodule "second"]
          path = second
          url = ssh://[email protected]:2222/first/second

  Note that this hack does not work if the `.gitmodules` file already exists.
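The difference between the two generated `.gitmodules` files can be checked with a short script. This is a sketch under stated assumptions: `parse_gitmodules` and `missing_keys` are hypothetical helpers, the parser only handles the simple `[submodule "name"]` / `key = value` layout shown above rather than the full git config syntax, and the URL is a placeholder.

```python
import re

# Matches section headers such as: [submodule "second"]
SECTION_RE = re.compile(r'^\[submodule "(?P<name>[^"]+)"\]$')


def parse_gitmodules(text):
    """Map each submodule name to a dict of its key/value settings."""
    entries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(('#', ';')):
            continue  # skip blank lines and comments
        match = SECTION_RE.match(line)
        if match:
            current = entries.setdefault(match.group('name'), {})
        elif current is not None and '=' in line:
            key, _, value = line.partition('=')
            current[key.strip()] = value.strip()
    return entries


def missing_keys(entries, required=('path', 'url')):
    """Return {submodule name: [missing keys]} for incomplete entries."""
    result = {}
    for name, settings in entries.items():
        missing = [key for key in required if key not in settings]
        if missing:
            result[name] = missing
    return result


# Placeholder content shaped like the URL-only file described above.
url_only = '[submodule "second"]\n\turl = ssh://example.invalid/second\n'
print(missing_keys(parse_gitmodules(url_only)))  # {'second': ['path']}
```

Running the same check on the file produced by `submodules.add()`, which has both `path` and `url`, would return an empty dict.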
Based on these observations, what is the recommended way to add a local submodule with pygit2? Calling `submodules.add()` and ignoring the error, since the repository already exists locally? Or calling `index.add()`, then setting the submodule url and path to generate the .gitmodules file, then adding it? Or another method that I'm not aware of?