Concurrent `zkg create` operations can race on update of default package source
While working on zeek/package-template#4 I noticed that multiple concurrent zkg create operations can race on update of the default package source. I did not check which other operations can trigger such an update.
I would have expected that an operation which does not obviously mutate zkg state does properly protect that state, e.g., with file locks.
Example:
# Save as `Makefile` in some directory.
# Run as many concurrent jobs as there are create operations.
MAKEFLAGS=-j6
all: 1.pkg 2.pkg 3.pkg 4.pkg 5.pkg 6.pkg
%.pkg:
zkg create --packagedir=$@ --user-var name=Name --force
$ make -B
zkg create --packagedir=4.pkg --user-var name=Name --force
zkg create --packagedir=5.pkg --user-var name=Name --force
zkg create --packagedir=6.pkg --user-var name=Name --force
Traceback (most recent call last):
File "/Users/bbannier/.local/pipx/venvs/zkg/lib/python3.10/site-packages/zeekpkg/_util.py", line 60, in make_symlink
os.symlink(target_path, link_path)
FileExistsError: [Errno 17] File exists: 'packages.zeek' -> '/Users/bbannier/.zkg/script_dir/packages/__load__.zeek'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/bbannier/.local/bin/zkg", line 2770, in <module>
main()
File "/Users/bbannier/.local/bin/zkg", line 2765, in main
manager = create_manager(args, config)
File "/Users/bbannier/.local/bin/zkg", line 387, in create_manager
manager = zeekpkg.Manager(state_dir=state_dir, script_dir=script_dir,
File "/Users/bbannier/.local/pipx/venvs/zkg/lib/python3.10/site-packages/zeekpkg/manager.py", line 359, in __init__
make_symlink('packages.zeek', self.autoload_package)
$ make -B
rning: skipped using package source named "zeek": failed to clone git repo
make: *** [5.pkg] Error 1
warning: skipped using package source named "zeek": failed to clone git repo
"package-template" will use value of "name" (the name of the package, e.g. "FooBar") from command line: Name
"package-template" will use value of "name" (the name of the package, e.g. "FooBar") from command line: Name
"package-template" will use value of "name" (the name of the package, e.g. "FooBar") from command line: Name
"package-template" will use value of "name" (the name of the package, e.g. "FooBar") from command line: Name
❯ make -B
zkg create --packagedir=1.pkg --user-var name=Name --force
zkg create --packagedir=2.pkg --user-var name=Name --force
zkg create --packagedir=3.pkg --user-var name=Name --force
zkg create --packagedir=4.pkg --user-var name=Name --force
zkg create --packagedir=5.pkg --user-var name=Name --force
zkg create --packagedir=6.pkg --user-var name=Name --force
warning: skipped using package source named "zeek": failed to clone git repo
warning: skipped using package source named "zeek": failed to clone git repo
warning: skipped using package source named "zeek": failed to clone git repo
As a workaround for my use case I am using a zkg config file with an empty sources section.
The failed to clone git repo is around doing a checkout upon constructing the default source. Adding a bit of -vv to the reproducer provides the details:
$ make
zkg -vv create --packagedir=1.pkg --user-var name=Name --force
zkg -vv create --packagedir=2.pkg --user-var name=Name --force
zkg -vv create --packagedir=3.pkg --user-var name=Name --force
zkg -vv create --packagedir=4.pkg --user-var name=Name --force
zkg -vv create --packagedir=5.pkg --user-var name=Name --force
zkg -vv create --packagedir=6.pkg --user-var name=Name --force
2022-10-12 13:59:23 WARNING failed to clone git repo: Cmd('git') failed due to: exit code(128)
cmdline: git checkout master
stderr: 'fatal: Unable to create '/home/awelzel/.zkg/clones/source/zeek/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.'
warning: skipped using package source named "zeek": failed to clone git repo
2022-10-12 13:59:24 WARNING failed to clone git repo: Cmd('git') failed due to: exit code(128)
cmdline: git checkout master
stderr: 'fatal: Unable to create '/home/awelzel/.zkg/clones/source/zeek/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.'
warning: skipped using package source named "zeek": failed to clone git repo
I hit another one that is seemingly independent of the default source, but same class of errors, Cloning/checking out the template repo seems also racy:
2022-10-12 14:08:40 ERROR problem while loading template https://github.com/zeek/package-template: failed to checkout branch/version "v1.0.0" of template https://github.com/zeek/package-template: Cmd('git') failed due to: exit code(128)
cmdline: git checkout v1.0.0
stderr: 'fatal: Unable to create '/home/awelzel/.zkg/clones/template/package-template/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.'
Traceback (most recent call last):
File "/home/awelzel/.local/lib/python3.9/site-packages/zeekpkg/template.py", line 146, in load
git_checkout(repo, version)
File "/home/awelzel/.local/lib/python3.9/site-packages/zeekpkg/_util.py", line 134, in git_checkout
clone.git.checkout(version)
File "/home/awelzel/.local/lib/python3.9/site-packages/git/cmd.py", line 639, in <lambda>
return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
File "/home/awelzel/.local/lib/python3.9/site-packages/git/cmd.py", line 1184, in _call_process
return self.execute(call, **exec_kwargs)
File "/home/awelzel/.local/lib/python3.9/site-packages/git/cmd.py", line 984, in execute
raise GitCommandError(redacted_command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
cmdline: git checkout v1.0.0
stderr: 'fatal: Unable to create '/home/awelzel/.zkg/clones/template/package-template/.git/index.lock': File exists.
I did not check which other operations can trigger such an update.
zkg list seems to trigger the two you showed as well:
$ make
zkg -vv list
zkg -vv list
zkg -vv list
zkg -vv list
zkg -vv list
zkg -vv list
Traceback (most recent call last):
File "/home/awelzel/.local/lib/python3.9/site-packages/zeekpkg/_util.py", line 60, in make_symlink
os.symlink(target_path, link_path)
FileExistsError: [Errno 17] File exists: 'packages.zeek' -> '/home/awelzel/.zkg/script_dir/packages/__load__.zeek'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/awelzel/.local/bin/zkg", line 2770, in <module>
main()
File "/home/awelzel/.local/bin/zkg", line 2765, in main
manager = create_manager(args, config)
File "/home/awelzel/.local/bin/zkg", line 387, in create_manager
manager = zeekpkg.Manager(state_dir=state_dir, script_dir=script_dir,
File "/home/awelzel/.local/lib/python3.9/site-packages/zeekpkg/manager.py", line 359, in __init__
make_symlink('packages.zeek', self.autoload_package)
File "/home/awelzel/.local/lib/python3.9/site-packages/zeekpkg/_util.py", line 63, in make_symlink
os.remove(link_path)
make: *** [Makefile:9: 3.pkg] Error 1
make: *** Waiting for unfinished jobs....
2022-10-12 14:18:08 WARNING failed to clone git repo: Cmd('git') failed due to: exit code(128)
cmdline: git checkout master
stderr: 'fatal: Unable to create '/home/awelzel/.zkg/clones/source/zeek/.git/index.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.'
warning: skipped using package source named "zeek": failed to clone git repo
Yeah I think this is just not at all part of the model atm, and clearly should be.
There's also the fact that ...
I would have expected that an operation which does not obviously mutate zkg state does properly protect that state
... covers surprisingly few operations because virtually everything starts with zkg refreshing state, potentially establishing directories, etc.