shards icon indicating copy to clipboard operation
shards copied to clipboard

Race Condition Between Dependencies and Their Subdependencies on Update

Open miry opened this issue 6 months ago • 14 comments

I believe I’m experiencing a related issue.

My project’s shard.yml includes:

dependencies:
  marten:
    github: miry/marten
    branch: master

  pg:
    github: will/crystal-pg
    commit: cafe112f2847f366262460ee999e74f9c7e8b31c   # unreleased 0.29.0

Meanwhile, the marten project declares pg as a development dependency:

development_dependencies:
  pg:
    github: will/crystal-pg

NOTE: There’s a difference in the commit — my project pins pg to a specific commit, while marten does not. When I run:

$ shard update marten
...
Using pg (0.29.0 at cafe112)

It seems like lib/pg is replaced with the released version from GitHub (0.29.0), but the shard.lock file still references the specific commit (cafe112...). Additionally, .shards.info shows a mismatch — it points to the latest release commit instead of the frozen one.

Here’s the diff:

   pg:
     git: https://github.com/will/crystal-pg.git
-    version: 0.29.0+git.commit.cafe112f2847f366262460ee999e74f9c7e8b31c
+    version: 0.29.0

Workaround

In case someone else runs into the same issue — the fix for lib/pg is simply to run:

$ shards update pg

PS: To help debug this, I started adding more context to log messages: https://github.com/crystal-lang/shards/pull/674/files

Reproducing steps
  1. Setup a new project with marten
$ mkdir shards_pg
$ cd shards_pg
$ shards init
  1. Update shards.yml:
name: shards_pg
version: 0.1.0
dependencies:
  marten:
    github: miry/marten
    commit: 1a40270aec0cdaa83e0c376a428ed45f49234364 # previous commit

  pg:
    github: will/crystal-pg
    commit: cafe112f2847f366262460ee999e74f9c7e8b31c # unreleased 0.29.0
  1. Install packages:
$ shards install
$ shards prune
  1. Verify the installed versions
$ cat shard.lock | grep pg -A 2    
  pg:
    git: https://github.com/will/crystal-pg.git
    version: 0.29.0+git.commit.cafe112f2847f366262460ee999e74f9c7e8b31c

$ head -n 5 lib/pg/CHANGELOG
v? upcoming
=====================
* Add support for `BigDecimal` via `PG::Numeric#to_big_d` (thanks @jgaskins)
* ~2x to ~10x+ speedup when sending large query parameters (thanks @compumike)
* Add support for COPY (thanks @17dec)

marten:

 $ cat shard.lock | grep marten -A 2
  marten:
    git: https://github.com/miry/marten.git
    version: 0.5.6+git.commit.1a40270aec0cdaa83e0c376a428ed45f49234364
  1. Update shards.yml: Remove the marten pin commit to trigger update
name: shards_pg
version: 0.1.0
dependencies:
  marten:
    github: miry/marten
  pg:
    github: will/crystal-pg
    commit: cafe112f2847f366262460ee999e74f9c7e8b31c # unreleased 0.29.0

and run update command:

$ shards update marten
Resolving dependencies
Fetching https://github.com/miry/marten.git
Fetching https://github.com/will/crystal-pg.git
Fetching https://github.com/crystal-lang/crystal-db.git
Fetching https://github.com/crystal-i18n/i18n.git
Fetching https://github.com/crystal-community/msgpack-crystal.git
Using db (0.13.1)
Using i18n (0.2.2)
Using msgpack (1.3.4)
Installing marten (0.5.6 at 469058f)
Using pg (0.29.0 at cafe112)
Postinstall of marten: scripts/precompile_marten_cli
Writing shard.lock
  1. Verify the package pg version and state:
$ head -n 5 lib/pg/CHANGELOG
v? upcoming
=====================

v0.29.0    2024-11-05
=====================

$ cat shard.lock | grep pg -A 2
  pg:
    git: https://github.com/will/crystal-pg.git
    version: 0.29.0+git.commit.cafe112f2847f366262460ee999e74f9c7e8b31c

As you see the lib/pg/CHANGELOG was changed, but version in the lock file still the same.

miry avatar May 19 '25 08:05 miry

I think having both branch and commit is the issue. On update the resolver might only consider the branch ref, and overlook the commit ref.

ysbaddaden avatar May 19 '25 11:05 ysbaddaden

Yeah, I don't think the development_dependencies of a dependency should have any effect. The dependency resolver doesn't even process this information.

So it's likely a race between branch and commit.

straight-shoota avatar May 19 '25 12:05 straight-shoota

I think having both branch and commit is the issue. On update the resolver might only consider the branch ref, and overlook the commit ref.

I had the same thought and tested it using just commit or branch. I'll update the description to avoid confusion.

While working on logging, I noticed there's logic that checks if a library/package is already installed and skips further action if it is. It seems like when I run shards update marten, it either ignores the spec for current project pg or processes it after resolving marten -> pg. Interesting that root shard.lock file is correct, but lib/.shards.info is exactly the same as installed in lib/pg.

At the same time, I created some draft tests with the similar fake repos and dependencies in shards — and they worked fine :)

I'll take a closer look later to confirm whether it's an issue specific to my environment.

miry avatar May 19 '25 12:05 miry

Here are my small observations around the branch vs. commit directive usage:

shards.lock looks like:

  pg:
    git: https://github.com/will/crystal-pg.git
    version: 0.29.0+git.commit.cafe112f2847f366262460ee999e74f9c7e8b31c
  1. Using both branch and commit at the same time or only single branch: master
dependencies:
  marten:
    github: miry/marten
    branch: master

  pg:
    github: will/crystal-pg
    branch: master
    commit: cafe112f2847f366262460ee999e74f9c7e8b31c   # unreleased 0.29.0
$ ~/src/crystal/shards/shard update marten | grep pg
D: [pg] git ls-remote --get-url origin
I: Fetching https://github.com/will/crystal-pg.git
D: [pg] git config --get remote.origin.mirror
D: [pg] git fetch --all --quiet
D: [pg] git ls-tree -r --full-tree --name-only cafe112f2847f366262460ee999e74f9c7e8b31c -- shard.yml
D: [pg] git show cafe112f2847f366262460ee999e74f9c7e8b31c:shard.yml
D: [pg] git log -n 1 --pretty=%H refs/heads/master
D: [pg] git ls-tree -r --full-tree --name-only refs/heads/master -- shard.yml
D: [pg] git show refs/heads/master:shard.yml
D: [pg] git ls-tree -r --full-tree --name-only cafe112f2847f366262460ee999e74f9c7e8b31c -- shard.yml
D: [pg] git show cafe112f2847f366262460ee999e74f9c7e8b31c:shard.yml
I: Using pg (0.29.0 at cafe112)

I see that during the upgrade it validates 3 times pg. 2 of them are correct versions cafe112f2847f366262460ee999e74f9c7e8b31c and only second time it uses: refs/heads/master.

I see that during the upgrade, pg is fetched three times. Two of them are for the correct version cafe112f2847f366262460ee999e74f9c7e8b31c, and only the second time it uses refs/heads/master.

  1. Using only the commit directive:
dependencies:
  marten:
    github: miry/marten
    branch: master

  pg:
    github: will/crystal-pg
    commit: cafe112f2847f366262460ee999e74f9c7e8b31c   # unreleased 0.29.0
$ ~/src/crystal/shards/shard update marten | grep pg
D: [pg] git ls-remote --get-url origin
I: Fetching https://github.com/will/crystal-pg.git
D: [pg] git config --get remote.origin.mirror
D: [pg] git fetch --all --quiet
D: [pg] git ls-tree -r --full-tree --name-only cafe112f2847f366262460ee999e74f9c7e8b31c -- shard.yml
D: [pg] git show cafe112f2847f366262460ee999e74f9c7e8b31c:shard.yml
D: [pg] git log -n 1 --pretty=%H cafe112f2847f366262460ee999e74f9c7e8b31c
D: [pg] git ls-tree -r --full-tree --name-only cafe112f2847f366262460ee999e74f9c7e8b31c -- shard.yml
D: [pg] git show cafe112f2847f366262460ee999e74f9c7e8b31c:shard.yml
D: [pg] git ls-tree -r --full-tree --name-only cafe112f2847f366262460ee999e74f9c7e8b31c -- shard.yml
D: [pg] git show cafe112f2847f366262460ee999e74f9c7e8b31c:shard.yml
I: Using pg (0.29.0 at cafe112)

I see that during the upgrade, pg is validates three times — all referencing the correct commit cafe112f2847f366262460ee999e74f9c7e8b31c. However, the contents of lib/pg are from the 0.29.0 version, not the commit.

diff --git a/lib/.shards.info b/lib/.shards.info
index 9d0bee9..49641e4 100644
--- a/lib/.shards.info
+++ b/lib/.shards.info
@@ -18,10 +18,10 @@ shards:
     version: 1.3.4
   marten:
     git: https://github.com/miry/marten.git
-    version: 0.5.6+git.commit.a5aa7b3c27756d78ad87e5d908935aed9992cec9
+    version: 0.5.6+git.commit.469058f5e5e158fc367de3b2b9439478ef408776
   pg:
     git: https://github.com/will/crystal-pg.git
-    version: 0.29.0+git.commit.cafe112f2847f366262460ee999e74f9c7e8b31c
+    version: 0.29.0
   openssl_ext:
     git: https://github.com/spider-gazelle/openssl_ext.git
     version: 2.4.4

miry avatar May 19 '25 17:05 miry

I have figured out the problem. After I cleaned the cache folder, I could no longer reproduce the issue.

$ rm -fr ~/.cache/crystal/
$ rm -fr ~/.cache/shards/

miry avatar May 19 '25 18:05 miry

Sadly, this still doesn't explain how the shards cache ended up in a state where this would happen. 😒

straight-shoota avatar May 19 '25 19:05 straight-shoota

The way how I reproduced the problem.

  1. Clone the https://github.com/martenframework/marten/
  2. Run the shards from the marten:
     $ shards
     Resolving dependencies
     Fetching https://github.com/crystal-lang/crystal-mysql.git
     Fetching https://github.com/crystal-lang/crystal-sqlite3.git
     Fetching https://github.com/crystal-community/timecop.cr.git
     Fetching https://github.com/will/crystal-pg.git
     Fetching https://github.com/crystal-ameba/ameba.git
     Fetching https://github.com/crystal-community/msgpack-crystal.git
     Fetching https://github.com/crystal-lang/crystal-db.git
     Fetching https://github.com/crystal-i18n/i18n.git
     Fetching https://github.com/naqvis/crystal-vips.git
     Using i18n (0.2.2)
     Using msgpack (1.3.4)
     Using ameba (1.7.0-dev at 65f7db0)
     Using db (0.13.1)
     Using mysql (0.16.0)
     Using pg (0.29.0)
     Using sqlite3 (0.21.0)
     Using timecop (0.5.0)
     Using vips (0.1.6)
    
    It installed the latest stable pg as expected.
  3. Change the folder to the working project with the pinned commit.
  4. Update only marten in that project:
    $ shards update marten
    $ git status
    geändert:       lib/.shards.info
    geändert:       lib/marten/.github/workflows/qa.yml
    geändert:       lib/marten/.github/workflows/specs.yml
    ...
    geändert:       lib/pg/CHANGELOG
    geändert:       lib/pg/CONTRIBUTORS
    ...
    
    The pg version in lib folder reverted to the from pinned to the stable.

Now I can continue debuging the problematic place.

miry avatar May 20 '25 05:05 miry

My findings so far:

The package in lib/pg is refreshed during the marten Postinstall step.

The shard.yml of Marten looks like this:

scripts:
  postinstall: scripts/precompile_marten_cli

The script looks like:

$ cat scripts/precompile_marten_cli
#!/bin/sh -e

if [ -z "$SKIP_MARTEN_CLI_PRECOMPILATION" ]; then
  shards build
  mkdir -p ../../bin
  cp -r "$PWD/bin/marten" "$PWD/../../bin/"
fi

At this step, the lib folder inside lib/marten is already linked to the top-level lib directory.

 $ ~/src/crystal/shards/shard update marten --no-postinstal --verbose | grep marten
...
D: [marten] git show 469058f5e5e158fc367de3b2b9439478ef408776:shard.yml
I: Installing marten (0.5.6 at 469058f)
D: [marten] rm -rf /Users/miry/src/volnov/drar-api/lib/marten
D: [marten] git --work-tree=/Users/miry/src/volnov/drar-api/lib/marten checkout 469058f5e5e158fc367de3b2b9439478ef408776 -- .
D: [marten] Link /Users/miry/src/volnov/drar-api/lib to /Users/miry/src/volnov/drar-api/lib/marten/lib
I: Postinstall of marten: scripts/precompile_marten_cli

miry avatar May 20 '25 06:05 miry

Thanks for keeping digging. A weird postinstall can mess up dependency resolution 🤦 Looks like another thing to add to Shards’ postinstall considered harmful

straight-shoota avatar May 20 '25 07:05 straight-shoota

I have replaced shard build with crystal build to avoid a mess with the lib folder.

https://github.com/miry/marten/commit/8e3c9a4991ca52f4b57fcd92c0c382f0811cfde3

I dunno what could be improved or done to prevent similar issues in the future in context of shards.

miry avatar May 20 '25 07:05 miry

shards could set an environment variable which would allow nested processes to realize they're executing inside the context of another run. This could either trigger an error or adopt the outer process' configuration.

straight-shoota avatar May 20 '25 07:05 straight-shoota

Ooh, that's... ugly 🙈

Indeed, a SHARDS_POSTINSTALL or SHARDS_DEPENDENCY_CHECK=skip envvar could tell a nested shards build to skip dependency checks (they must have been installed) and immediately build.

ysbaddaden avatar May 20 '25 13:05 ysbaddaden

Would it be possible to share the shard.lock file with dependencies in the same way as the lib folder?

miry avatar May 21 '25 05:05 miry

Sounds interesting, but that won't prevent shards calls for an installed shard to mess with the shard.lock and the main lib folder. Unless we generate a specific shard.lock for each shard?

~~I'm wondering if we really should install the shard.yml and link the main lib folder when installing a shard. We could tweak the CRYSTAL_LIB instead when calling the postinstall script 🤔~~ Oh, we can't shards build without a shard.yml.

ysbaddaden avatar May 22 '25 14:05 ysbaddaden

Indeed, a SHARDS_POSTINSTALL or SHARDS_DEPENDENCY_CHECK=skip envvar could tell a nested shards build to skip dependency checks (they must have been installed) and immediately build.

I really like this solution. By the time any shard is being installed during a shards install flow, all of its dependencies have already been installed. If this is invariant, a child shards invocation has no real need to install anything, right?


Also, I think this issue and https://github.com/crystal-lang/shards/issues/398#issuecomment-1027529954 may be the same.

jgaskins avatar Oct 27 '25 04:10 jgaskins