rose icon indicating copy to clipboard operation
rose copied to clipboard

Git file install: support for file creation from Git repositories

Open benfitzpatrick opened this issue 1 year ago • 9 comments

Work in progress!

Aims to address #1419.

I'm not that happy with the configuration syntax, although I am happy with the three main elements of it as elements.

The mechanism itself uses filtering and either partial clones or proper sparse checkouts where the Git version allows.

benfitzpatrick avatar Jan 17 '24 13:01 benfitzpatrick

Let us know when you want us to have a review at this.

wxtim avatar Jan 18 '24 08:01 wxtim

Ready for review, not ready to go in IMO!

benfitzpatrick avatar Jan 23 '24 13:01 benfitzpatrick

Ready for review, not ready to go in IMO!

@benfitzpatrick, what's left to do?

oliver-sanders avatar Feb 26 '24 10:02 oliver-sanders

I'd like to do a bit of in-anger user-case testing if we feel it's close enough as-is? Shouldn't take too long.

benfitzpatrick avatar Feb 26 '24 10:02 benfitzpatrick

I've done some testing against real cases I could find in our workflows, found and fixed some issues in afddb2b.

I tested a case with 6 GitHub file installs in a single rose-app.conf - we were worried about hitting a rate limit. It worked fine and looked like these happen sequentially, at least from the verbose output? Happy to demo offline.

Otherwise I'm happy

benfitzpatrick avatar Mar 14 '24 16:03 benfitzpatrick

looked like these happen sequentially

Curious, I'll check that's because GitHub is serving the requests sequentially rather than Rose failing to run them in parallel.

oliver-sanders avatar Mar 15 '24 16:03 oliver-sanders

Little bump

benfitzpatrick avatar Apr 10 '24 13:04 benfitzpatrick

looked like these happen sequentially

Curious, I'll check that's because GitHub is serving the requests sequentially rather than Rose failing to run them in parallel.

After a bit of digging I'm happy that these operations are being run concurrently, resulting in their subprocesses being run in parallel.

To track subprocesses, try the following diff:

diff --git a/metomi/rose/popen.py b/metomi/rose/popen.py
index 8993fb93..8ec00c94 100644
--- a/metomi/rose/popen.py
+++ b/metomi/rose/popen.py
@@ -372,6 +372,7 @@ class RosePopener:
                 proc = Popen(args[0], **kwargs)
             else:
                 command = ' '.join(map(shlex.quote, args))
+                print(f'# {command}')  # process started
                 proc = await asyncio.create_subprocess_shell(command, **kwargs)
         except OSError as exc:
             if exc.filename is None and args:
@@ -392,7 +393,9 @@ class RosePopener:
         if isinstance(kwargs.get("stdin"), str):
             stdin = kwargs.get("stdin")
         stdout, stderr = await proc.communicate(stdin)
+        command = ' '.join(map(shlex.quote, args))
         await proc.wait()
+        print(f'$ {command}')  # process returned
         return proc.returncode, stdout, stderr
 
     async def run_ok_async(self, *args, **kwargs):

Using this diff with the following config:

[command]
default=true

[file:cylc/scheduler.py]
source=git:[email protected]:cylc/cylc-flow.git::cylc/flow/scheduler.py::master

[file:cylc/task_proxy.py]
source=git:https://github.com/cylc/cylc-flow.git::cylc/flow/task_proxy.py::master

[file:cylc/async_util.py]
source=git:https://github.com/cylc/cylc-flow.git::cylc/flow/async_util.py::master

I get this output showing up to three subprocesses running simultaneously:

# git --git-dir=/var/tmp/tmpqqkj_tvk/.git init
# git --git-dir=/var/tmp/tmpd4c71k8x/.git init
# git --git-dir=/var/tmp/tmp50f2vgwd/.git init
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git init
# git --git-dir=/var/tmp/tmpqqkj_tvk/.git remote add origin https://github.com/cylc/cylc-flow.git
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git init
# git --git-dir=/var/tmp/tmp50f2vgwd/.git remote add origin https://github.com/cylc/cylc-flow.git
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git init
# git --git-dir=/var/tmp/tmpd4c71k8x/.git remote add origin [email protected]:cylc/cylc-flow.git
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git remote add origin https://github.com/cylc/cylc-flow.git
# git --git-dir=/var/tmp/tmpqqkj_tvk/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git remote add origin https://github.com/cylc/cylc-flow.git
# git --git-dir=/var/tmp/tmp50f2vgwd/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git remote add origin [email protected]:cylc/cylc-flow.git
# git --git-dir=/var/tmp/tmpd4c71k8x/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
# git --git-dir=/var/tmp/tmpqqkj_tvk/.git --work-tree=/var/tmp/tmpqqkj_tvk checkout 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
# git --git-dir=/var/tmp/tmp50f2vgwd/.git --work-tree=/var/tmp/tmp50f2vgwd checkout 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmp50f2vgwd/.git --work-tree=/var/tmp/tmp50f2vgwd checkout 4b70574c9100e1cf830a226c254cb3159a558873
# rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmp50f2vgwd/cylc/flow/async_util.py /var/tmp/tmph9lc16hm/f13f5dacdb0f632826821bf9278a10fc
$ git --git-dir=/var/tmp/tmpqqkj_tvk/.git --work-tree=/var/tmp/tmpqqkj_tvk checkout 4b70574c9100e1cf830a226c254cb3159a558873
# rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpqqkj_tvk/cylc/flow/task_proxy.py /var/tmp/tmph9lc16hm/759c658589e37a9af232bfa8e03d01b0
$ rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpqqkj_tvk/cylc/flow/task_proxy.py /var/tmp/tmph9lc16hm/759c658589e37a9af232bfa8e03d01b0
$ rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmp50f2vgwd/cylc/flow/async_util.py /var/tmp/tmph9lc16hm/f13f5dacdb0f632826821bf9278a10fc
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git fetch --depth=1 origin 4b70574c9100e1cf830a226c254cb3159a558873
# git --git-dir=/var/tmp/tmpd4c71k8x/.git --work-tree=/var/tmp/tmpd4c71k8x checkout 4b70574c9100e1cf830a226c254cb3159a558873
$ git --git-dir=/var/tmp/tmpd4c71k8x/.git --work-tree=/var/tmp/tmpd4c71k8x checkout 4b70574c9100e1cf830a226c254cb3159a558873
# rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpd4c71k8x/cylc/flow/scheduler.py /var/tmp/tmph9lc16hm/8a367e85378d4334d5133fa065d71116
$ rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpd4c71k8x/cylc/flow/scheduler.py /var/tmp/tmph9lc16hm/8a367e85378d4334d5133fa065d71116

You can also test it like this:

diff --git a/metomi/rose/popen.py b/metomi/rose/popen.py
index 8993fb93..35bdbe27 100644
--- a/metomi/rose/popen.py
+++ b/metomi/rose/popen.py
@@ -371,7 +371,8 @@ class RosePopener:
             if kwargs.get("shell"):
                 proc = Popen(args[0], **kwargs)
             else:
-                command = ' '.join(map(shlex.quote, args))
+                # command = ' '.join(map(shlex.quote, args))
+                command = 'sleep 5'
                 proc = await asyncio.create_subprocess_shell(command, **kwargs)
         except OSError as exc:
             if exc.filename is None and args:

The time taken should increase linearly as the sleep is increased, but only see a small increase (parsing overheads) when more files are configured for installation.

oliver-sanders avatar Apr 12 '24 09:04 oliver-sanders

Curious, I'll check that's because GitHub is serving the requests sequentially rather than Rose failing to run them in parallel.

After a bit of digging I'm happy that these operations are being run concurrently, resulting in their subprocesses being run in parallel.

Ah, I know why I thought they were sequential - the parsing is serial, the actual pulling can be concurrent. Sorry-thanks!

benfitzpatrick avatar Apr 15 '24 16:04 benfitzpatrick

I've opened #2784 - it looks like the bug I found is unrelated to this work.

wxtim avatar Jun 06 '24 12:06 wxtim

Whilst working on #2785, I've noticed that "cannot connect to" type errors appear to surface with the following message:

ConfigProcessError: file:README.md=source=git:[email protected]:metomi/rose::README.md::master: ls-remote: could not find ref 'master' in '[email protected]:metomi/rose'

Where the underlying cause here is that GitHub SSH access has not been configured.

oliver-sanders avatar Jun 13 '24 12:06 oliver-sanders

I could do a "or remote not contactable?" on the end of the error message?

benfitzpatrick avatar Jun 14 '24 08:06 benfitzpatrick

If it's possible to differentiate between these cases, great, but if not that'll be fine.

oliver-sanders avatar Jun 14 '24 09:06 oliver-sanders

Waiting for tests to pass again, then ready to merge.

oliver-sanders avatar Jun 18 '24 09:06 oliver-sanders

Good news, the tests pass.

Bad news, if some of the tests skip, then the test fails, e.g:

diff --git a/t/rose-app-run/28-git.t b/t/rose-app-run/28-git.t
index 1b4eaaa90..92a411611 100644
--- a/t/rose-app-run/28-git.t
+++ b/t/rose-app-run/28-git.t
@@ -72,12 +72,12 @@ remote_locations=("$HOSTNAME:$TEST_DIR/hellorepo/" "http://localhost:$GIT_WS_POR
for i in 0 1 2; do
     remote_mode="${remote_test_modes[$i]}"
     remote="${remote_locations[$i]}"
-    if [[ "$remote_mode" == "ssh" ]] && ! ssh -n -q -oBatchMode=yes $HOSTNAME true 1>'/dev/null' 2>/dev/null; then
+    if true; then
         skip 14 "cannot ssh to localhost $HOSTNAME"
         echo "Skip $remote" >/dev/tty        
         continue
     fi
-    if [[ "$remote_mode" == "http" ]] && ! curl --head --silent --fail $remote >/dev/null 2>&1; then
+    if true; then
         skip 14 "failed to launch http on localhost"
         echo "Skip $remote" >/dev/tty        
         continue

:(

[FAIL] ../config: rose-app.conf not found.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 15/57 subtests 
	(less 42 skipped subtests: 0 okay)
Test Summary Report
-------------------
t/rose-app-run/28-git.t (Wstat: 256 Tests: 42 Failed: 0)
  Non-zero exit status: 1
  Parse errors: Bad plan.  You planned 57 tests but ran 42.

Not a release blocker. Suggest bumping this to future work.

~Opened: https://github.com/metomi/rose/issues/2789~

oliver-sanders avatar Jun 18 '24 10:06 oliver-sanders

@MetRonnie, could you check the last couple of commits and merge if you're happy with the tests being pushed to a bugfix release milestone.

oliver-sanders avatar Jun 18 '24 10:06 oliver-sanders

Brill, that's fixed it (will close issue). Tested with all possible skip combinations, test passed each time.

oliver-sanders avatar Jun 18 '24 10:06 oliver-sanders

(Mac OS tests failing due to gh-actions DNS issues, safe to ignore)

oliver-sanders avatar Jun 18 '24 10:06 oliver-sanders

(the test fixes would have been on a bugfix release milestone, however, there is no need for a post-fix, Ben has sorted it on this PR)

oliver-sanders avatar Jun 18 '24 10:06 oliver-sanders