Add a shallow-clone option for git packages

Open OgieBen opened this issue 3 years ago • 10 comments

At my company we are currently combining Unity with Flutter through https://pub.dev/packages/flutter_unity_widget, and we host the Unity widget dependency in a git repo. The repo gets big really quickly, and we have to keep deleting tags.

I would like to submit a PR that lets a user tell pub to make a shallow clone of a Git repository instead of a mirror clone of the remote repository.

For example a user could use the following command:

dart pub add http --git-url=https://github.com/my/http.git --git-ref=tmpfixes --git-shallow-clone=true

or inside the pubspec.yaml file

dependencies:
 vm_service:
   git:
     url: https://dart.googlesource.com/sdk
     ref: refs/changes/80/156980/3
     path: pkg/vm_service
     shallow-clone: true

Alternatively, the option could take the depth of the shallow clone instead of a boolean flag:

dart pub add http --git-url=https://github.com/my/http.git --git-ref=tmpfixes --git-shallow-clone=1

or inside the pubspec.yaml file

dependencies:
 vm_service:
   git:
     url: https://dart.googlesource.com/sdk
     ref: refs/changes/80/156980/3
     path: pkg/vm_service
     shallow-clone: 1
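
The two forms of the proposed option above (a boolean or a depth) could be normalized into a single depth value. This is only a minimal sketch of one possible interpretation: the function name `shallow_depth` is hypothetical, the `shallow-clone` key is the proposal from this issue and not an existing pub feature, and treating `true` as a depth of 1 is an assumption.

```python
from typing import Optional, Union


def shallow_depth(value: Union[bool, int, None]) -> Optional[int]:
    """Map a hypothetical `shallow-clone` value to a clone depth.

    Returns None for a full (mirror) clone, otherwise a positive depth.
    """
    # Booleans are handled first: in Python, True and False are also ints.
    if value is None or value is False:
        return None
    if value is True:
        return 1  # assumption: `true` means "latest commit only"
    if isinstance(value, int) and value > 0:
        return value
    raise ValueError(f"invalid shallow-clone value: {value!r}")


print(shallow_depth(True))  # 1
print(shallow_depth(5))     # 5
print(shallow_depth(None))  # None
```

Rejecting zero and negative values keeps an invalid depth from silently turning into a full clone.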

There is a similar issue here: https://github.com/dart-lang/pub/issues/2686.

OgieBen avatar Sep 06 '22 16:09 OgieBen

This is a private package, right? And the issue is that it's large, so the git dependency with a full clone takes up a lot of space and bandwidth.

So is it correct that the possible solutions might be:

  • Use a private package repository? (Probably, less preferable because you can't piggy back off the authentication you already have for git)
  • git LFS (maybe?), if we tweaked pub to allow it?
  • shallow git clones?

I'm curious: if you have multiple Dart SDKs installed, how will shallow clones affect the PUB_CACHE, and how will an old Dart SDK interact with it? (There is possibly a solution, I'm just saying we need to figure this out.)

Also how do git shallow clones actually work? How shallow are they? What does the depth mean, and when is that sensible? Are they supported by all git versions, or will we need feature detection?

Should we migrate to only using shallow clones? Or are full clones still sensible in some scenarios?

Sorry for the dumb questions; I'm not fully versed in all the details of modern git. And anything that changes the layout in PUB_CACHE requires care to ensure it works when users upgrade/downgrade SDKs.

jonasfj avatar Sep 08 '22 22:09 jonasfj

Hi @jonasfj, I think most of your questions are valid.

I am not sure the following options you suggested will resolve the issue, because we would still need to pull the large history of our repository:

  • Use a private package repository? (Probably, less preferable because you can't piggy back off the authentication you already have for git)
  • git LFS (maybe?), if we tweaked pub to allow it?

About this question:

I'm curious: if you have multiple Dart SDKs installed, how will shallow clones affect the PUB_CACHE, and how will an old Dart SDK interact with it? (There is possibly a solution, I'm just saying we need to figure this out.)

A project using an older version of Dart will use the same version of the package cached in PUB_CACHE. The only difference between the mirror-cloned and shallow-cloned versions is that the shallow-cloned package will have a shorter commit history than the mirror clone.

This is how the depth option works:

--depth <depth>: Create a shallow clone with a history truncated to the specified number of commits. Implies --single-branch unless --no-single-branch is given to fetch the histories near the tips of all branches. If you want to clone submodules shallowly, also pass --shallow-submodules.

It basically allows us to pull a specific number of commits instead of fetching the entire history of the git repository.

Should we migrate to only using shallow clones? Or are full clones still sensible in some scenarios?

Basically, the idea is to make a mirror clone when the shallow-clone option is not provided, i.e., the mirror clone remains the default strategy for fetching git packages, and we only make a shallow clone when the shallow-clone option is provided.
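
The depth option and the proposed default can be sketched together as a function that builds the `git clone` command line. This is an illustration under stated assumptions, not pub's actual implementation: the function name `clone_command` and its parameters are hypothetical, and only the `--mirror` and `--depth` flags of `git clone` are used.

```python
from typing import List, Optional


def clone_command(url: str, dest: str, depth: Optional[int] = None) -> List[str]:
    """Build a `git clone` argv: mirror by default, shallow if depth is set."""
    if depth is None:
        # Default strategy described in the thread: a full mirror clone.
        return ["git", "clone", "--mirror", url, dest]
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    # --depth truncates history to the given number of commits and
    # implies --single-branch.
    return ["git", "clone", "--depth", str(depth), url, dest]


print(" ".join(clone_command("https://github.com/my/http.git", "http", depth=1)))
# git clone --depth 1 https://github.com/my/http.git http
```

Because `--depth` implies `--single-branch`, this form only fetches the tip of the default branch. Resolving an arbitrary ref, such as the `refs/changes/...` ref in the pubspec example, would need extra steps that this sketch ignores.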

OgieBen avatar Sep 09 '22 07:09 OgieBen

Basically the idea is to make a mirror clone when the shallow-clone option is not provided i.e we make mirror clone the default strategy for fetching git packages and only make a git shallow clone when the shallow-clone option is provided.

I get that, my question is if it's better to always make a shallow clone.

jonasfj avatar Sep 13 '22 08:09 jonasfj

Use a private package repository?

Would certainly alleviate concerns about having a huge git history.

jonasfj avatar Sep 13 '22 08:09 jonasfj

Basically the idea is to make a mirror clone when the shallow-clone option is not provided i.e we make mirror clone the default strategy for fetching git packages and only make a git shallow clone when the shallow-clone option is provided.

I get that, my question is if it's better to always make a shallow clone.

I am not sure if it is best to always make a shallow clone, but I think it would be good to have an option to make a shallow clone when making a mirror clone becomes infeasible.

OgieBen avatar Sep 13 '22 08:09 OgieBen

Any update on this?

2shrestha22 avatar May 06 '23 12:05 2shrestha22

Reading https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/ made me think that partial blob-less clones, or maybe even partial tree-less clones, might work well for pub. That would save a lot of bandwidth, while working well with how e.g. GitHub serves repos.

I guess there are still a lot of questions to answer before attempting this.

  • Can we git fetch in a tree-less fashion?
  • Will this interact well with existing pub caches with full checkouts?
  • Is this too breaking to do always? (You could no longer rely on the past history of your dependencies being available offline.)
  • Are there any other unintended side-effects?
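
For reference, the blob-less and tree-less variants mentioned above correspond to the `--filter` option of `git clone`. The following is a minimal sketch that only builds the command lines. The names `FILTERS` and `partial_clone_command` are hypothetical, and the sketch does not answer whether pub could or should use either variant.

```python
from typing import List

# Filter specs accepted by `git clone --filter=...` for partial clones.
FILTERS = {
    "blobless": "blob:none",  # commits and trees now, file contents on demand
    "treeless": "tree:0",     # commits now, trees and blobs on demand
}


def partial_clone_command(url: str, dest: str, kind: str) -> List[str]:
    """Build a `git clone` argv for a blob-less or tree-less partial clone."""
    if kind not in FILTERS:
        raise ValueError(f"unknown partial clone kind: {kind!r}")
    return ["git", "clone", f"--filter={FILTERS[kind]}", url, dest]


print(" ".join(partial_clone_command("https://github.com/my/http.git", "http", "blobless")))
# git clone --filter=blob:none https://github.com/my/http.git http
```

Unlike a shallow clone, both variants keep the full commit history and defer downloading the missing objects until they are needed, which is why offline availability comes up in the questions above.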

sigurdm avatar Jun 09 '23 12:06 sigurdm