SemVer support

Open Hologos opened this issue 5 years ago • 27 comments

Hi,

I just found peru and have played with it a little. I would really love to deploy it at our company, but I am missing one key feature. We need the ability to lock modules to a specific version (preferably a tag) or, better, to specify a version constraint.

Example

imports:
    rack_example: rack/  # This is where we want peru to put the module.

git module rack_example:
    url: git://github.com/chneukirchen/rack.git
    version: 1.3

This would fetch only versions 1.3.* of the module. We need this because our imaginary project is compatible with rack 1.3, not with 1.4, but we still want patches for 1.3. If I run peru reup, it downloads the most recent version, which is 1.4.1.

I know I can write my own plugin (and I probably will for tarballs), but I would really like to use builtin plugins so I wouldn't have to deploy all plugins to all new employees that join our company.

The best option would be to implement something like composer has (and other dependency managers).

Hologos avatar Oct 02 '18 15:10 Hologos

If the versions are available as git branches or tags, peru already supports fetching them. For example, this is valid:

git module rack:
    url: https://github.com/chneukirchen/rack
    rev: 1.4.1

In general, whatever you put in rev gets passed to the git rev-parse command under the covers. One thing to note is that, by default, peru reup will still replace that rev field with the master commit hash. You can run peru reup [some other module] to avoid this, or you can also do something like the following to tell reup to target something other than master for this module:

git module rack:
    url: https://github.com/chneukirchen/rack
    reup: 1.4.1
    rev: c78d30ae39b437420cc94b62002ccc1ac0337cc7

Since you mentioned tarballs, did you catch the unpack: tar example in the readme? It might be that peru already supports what you need there also, via the curl plugin. (Which is not in fact implemented with command line curl, but rather Python's standard HTTP libraries.)

oconnor663 avatar Oct 02 '18 19:10 oconnor663

If the versions are available as git branches or tags, peru already supports fetching them. For example, this is valid:

git module rack:
   url: https://github.com/chneukirchen/rack
   rev: 1.4.1

The problem is, it will only fetch 1.4.1. What I need is that when I specify version: 1.4, it fetches 1.4.* only, never 1.5 or higher.

Hologos avatar Oct 03 '18 04:10 Hologos

Yeah, peru (like git) has no notion of semver, or any other way that tags might be related. I could imagine implementing some kind of feature like "extract the latest tag name", but I think that would be error-prone. For example, how does it know to avoid a tag like 2.0-rc1? If we build in semver support, should it work with people who put a "v" at the front of their tag names? I worry that git repo conventions aren't uniform enough for something like this to work consistently.

I think if you've got a project with dependencies that are all in the same package format, like Ruby gems or whatever, it's probably going to be better to use a language-specific tool that understands their versions and their dependency relationships. But tell me more about the problems you're trying to solve.

oconnor663 avatar Oct 03 '18 16:10 oconnor663

I worry that git repo conventions aren't uniform enough for something like this to work consistently.

The common convention is to name tags as vX.Y.Z.... Github itself recommends this when creating a release.

For example, how does it know to avoid a tag like 2.0-rc1?

Composer, for example, parses a tag name and if it finds vX.Y-something, it treats it as an unstable release. You have to explicitly say that you want unstable releases in order to include tags like this.
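To make the tag-naming question concrete, here is a minimal sketch of how tag names could be parsed while tolerating an optional "v" prefix and flagging "-something" suffixes as unstable. This is not Composer's or peru's actual code; the pattern and the names TAG_RE, ParsedTag, parse_tag and stable_versions are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern: optional "v", one to three numeric parts,
# and an optional "-suffix" such as "-rc1".
TAG_RE = re.compile(r"^v?(\d+)(?:\.(\d+))?(?:\.(\d+))?(?:-([0-9A-Za-z.-]+))?$")


class ParsedTag(NamedTuple):
    version: tuple  # (major, minor, patch); missing parts default to 0
    stable: bool    # False when the tag carries a "-something" suffix


def parse_tag(name: str) -> Optional[ParsedTag]:
    """Return the parsed tag, or None if the name does not look like a version."""
    match = TAG_RE.match(name)
    if match is None:
        return None
    major, minor, patch, suffix = match.groups()
    version = (int(major), int(minor or 0), int(patch or 0))
    return ParsedTag(version, suffix is None)


def stable_versions(tag_names, include_unstable=False):
    """Sorted version tuples; unstable tags are skipped unless asked for."""
    parsed = (parse_tag(name) for name in tag_names)
    return sorted(
        p.version for p in parsed if p and (include_unstable or p.stable)
    )
```

With this sketch, a tag like 2.0-rc1 is only considered when include_unstable=True is passed, and names that are not versions at all (for example nightly) are ignored.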

But tell me more about the problems you're trying to solve.

I am looking for a language-agnostic dependency manager. It has to support tarballs and git (at least) and version constraints. The reason is that in our company we use a lot of shell, Perl and Python scripts to manage our servers. We wrote a lot of libraries and use them in a lot of scripts. We are now constantly fighting to keep the libraries up-to-date inside the projects (up-to-date = only versions that don't contain a BC break).

Peru would be a great candidate but it has to support this. I've been looking into plugins and I will have to probably extend the git and curl plugins to support this. The downside is as I stated - I will have to deploy the modified plugins in our company.

Hologos avatar Oct 03 '18 16:10 Hologos

Despite (or maybe because of) working on the peru project for so long, I still advise people to consider unifying their repos, especially if they're effectively a single organization working on a lot of related projects. A mono-repo approach makes a lot of these problems disappear. (Though I understand some of the limitations, especially around access permissions and things like that.)

If you do want to stick with peru, and you do control all of the repos in question, but you don't want to generally reup from master, another approach you could consider would be to have each of your projects define release branches for the versions they support. For example, a repo with tags at v1.0, v1.1, and v1.2 might also maintain a v1 branch that moves forward as each of those tags are added. Then if the project tags v2.0, it could create a v2 branch and leave the v1 branch pointing to v1.2.

The reason I suggest that is that it lets the repo define its own conventions, so that tooling that already works with git branches doesn't need to know anything more. If you want to define conventions for peru to interpret branch names itself, I worry that it opens up a can of worms. Here are a couple of thoughts.

  • Not all projects follow SemVer. Ignoring simple issues like whether they put a "v" in front of things, some projects (like the Linux kernel) do major version bumps without breaking backwards compatibility. Would a git-tag-semver approach require some sort of escape hatch? How complicated would that escape need to be? For example if we're fetching the Linux repo, presumably it's not ok to just fetch from master. How do we tell peru which tags constitute stable versions?
  • There seems to be some disagreement Out There about what SemVer means prior to 1.0. The spec itself officially makes no guarantees. A lot of projects (like the Rust package manager, Cargo) follow the "minor version bumps are incompatible, patch version bumps are compatible" convention. Should peru follow that convention?
  • On the peru-implementation-details side of things, the core peru codebase actually doesn't know anything about the rev or reup fields of the different repo types. Those are purely plugin implementation details. Would that mean that each plugin would need its own implementation of the SemVer logic? So far plugins have remained simple enough that they actually don't need to share any code with each other, so this could be a bit of a departure.

All of these questions are potentially answerable, but they make me feel like baking this sort of feature into peru would be a big commitment, especially if solutions that don't require special support are also viable.

oconnor663 avatar Oct 04 '18 19:10 oconnor663

Did you check how composer implements this? It doesn't have to solve any of the problems you describe; it's all up to the user to specify things.

Hologos avatar Oct 04 '18 21:10 Hologos

What I need is when I specify version: 1.4, it fetches 1.4.* only, never 1.5 and higher.

Ditto

naturallymitchell avatar Oct 11 '18 17:10 naturallymitchell

Looks like semver can do this.

naturallymitchell avatar Oct 11 '18 19:10 naturallymitchell

A mono-repo model or unified repositories seem preferable, but that's not always possible and I can see the appeal of using SemVer here. I don't think peru should ever attempt to understand SemVer and this should probably be supported within plugins if at all. Assuming we want to support this in some way, I have the following questions and thoughts.

  1. Should peru ship plugins that support SemVer or should these plugins be provided elsewhere?
  2. If peru ships these plugins, should SemVer support be baked into existing plugins like git or should new plugins designed for SemVer be created (e.g., git-semver)?
  3. How do we distribute these plugins if they do not ship with the base peru installation?
  4. How does configuration work? Plugins could expose fields for this, but which ones? What if there are many similar modules within a peru project?

At first glance, I think the answer to [1] and [2] is that these should be separate plugins provided by the peru installation.

The configuration required for SemVer could be complex and would likely bloat the plugins we already ship today, creating unnecessary complexity and the possibility for bugs for most users. It could even create incompatibilities with existing fields. I think the semantics are different enough to warrant separate plugins.

Shipping them with peru as separate plugins should be relatively harmless for users that don't need this functionality while eliminating difficulty with distribution for the users that do (we can effectively ignore [3]). However, we may want to get a better idea of just how often this is needed. If we think this is a bit too niche, then perhaps a separate source for these plugins could make more sense.

Answering [4] may be tricky, as there are a lot of moving parts to consider. Looking to other projects that support some notion of SemVer is a good start. As already mentioned, it should be possible to support different encodings in tags, for example.

olson-sean-k avatar Oct 11 '18 20:10 olson-sean-k

Rust is a good example. Its ecosystem generally uses semver, and the Cargo package manager handles everything very well. I think the basic idea is described here, possibly: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#caret-requirements
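As a rough illustration of the caret rule the linked Cargo page describes (an update is allowed as long as it does not change the leftmost non-zero component), here is a small sketch. It handles plain numeric versions only; the function names are hypothetical and this is not Cargo's implementation.

```python
def parse(version: str) -> tuple:
    """Parse "X", "X.Y" or "X.Y.Z" into a 3-tuple, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def caret_bounds(requirement: str) -> tuple:
    """Return (inclusive lower bound, exclusive upper bound) for ^requirement."""
    parts = [int(p) for p in requirement.split(".")]
    lower = tuple(parts + [0] * (3 - len(parts)))
    # Bump the leftmost non-zero component; if every given component is
    # zero, bump the last component that was written.
    idx = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = tuple(parts[:idx] + [parts[idx] + 1] + [0] * (2 - idx))
    return lower, upper


def caret_matches(requirement: str, candidate: str) -> bool:
    lower, upper = caret_bounds(requirement)
    return lower <= parse(candidate) < upper
```

Note that under this rule a requirement of 1.3 would still accept 1.4.1. The "1.3.* only" behaviour requested at the top of this thread is closer to what Cargo calls a tilde requirement.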

naturallymitchell avatar Oct 12 '18 00:10 naturallymitchell

I'm getting more interested in this as I think about it over time. I'm busy with other projects right now, but I'd be happy to mentor anyone who wants to try to tackle this. Some scattered thoughts:

  • ~Note that https://github.com/buildinspace/peru/pull/194 is going to be a big refactor, where we move from the old Python-3.3-compatible syntax to the modern async/await syntax. If anyone's going to start work on a big PR, it should probably be based on that to avoid conflicts. Actually if anyone's going to start work on a big PR, I should just re-review that one and land it.~ Edit: #194 has landed, so no more need to worry about this.

  • I can think of at least two different architectural approaches to this. One would be to have the git plugin accept a semver: 1.6 flag, and to just implement this whole thing itself inside of its part of the reup command. Another would be to define a new plugin capability called versions or something, where the plugin reports all the available tags/branches/whatever, and the peru core takes care of figuring out which one of those tags is the latest compatible one. I lean towards the latter, since it avoids duplicating some potentially complicated version parsing / constraint satisfying logic across many plugins. But the latter will also require deeper changes to peru than the former would.
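The second approach could look roughly like the sketch below: the plugin only reports names, and the core decides which one is the latest acceptable version. Everything here is hypothetical, including list_versions, FakeGitPlugin and choose_version. Real peru plugins are not in-process Python objects like this, so treat it as a picture of the division of labour rather than a proposed API.

```python
from typing import Callable, Iterable, Optional, Protocol


class VersionedPlugin(Protocol):
    """Hypothetical capability: report every tag/branch/version name."""

    def list_versions(self, url: str) -> Iterable[str]: ...


class FakeGitPlugin:
    """Stand-in for illustration; a real plugin would ask the remote."""

    def __init__(self, tags):
        self._tags = list(tags)

    def list_versions(self, url):
        return self._tags


def numeric_key(name: str) -> Optional[tuple]:
    """Turn "v1.3.2" into (1, 3, 2); return None for anything else."""
    try:
        return tuple(int(p) for p in name.lstrip("v").split("."))
    except ValueError:
        return None


def choose_version(
    plugin: VersionedPlugin, url: str, accepts: Callable[[tuple], bool]
) -> Optional[str]:
    """Core-side logic: pick the highest reported name that is acceptable."""
    candidates = [(numeric_key(n), n) for n in plugin.list_versions(url)]
    matching = [(k, n) for k, n in candidates if k is not None and accepts(k)]
    return max(matching)[1] if matching else None
```

The point of the split is that the plugin never sees the constraint and the core never talks to git, so the version logic would exist in exactly one place.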

oconnor663 avatar Nov 22 '18 16:11 oconnor663

@olson-sean-k

I don't think peru should ever attempt to understand SemVer and this should probably be supported within plugins if at all.

I think the complete opposite. I don't see the reason to program it to every plugin since the logic will be the same. I mean, the calculation of whether version 1.3 passes the rule >=1.2 <2.0.0 will be the same for the git plugin, for the tar plugin, etc.
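For reference, the check mentioned above fits in a few lines that do not depend on where the version came from. This is only a sketch: the constraint syntax (space-separated clauses, all of which must hold) and the names satisfies and parse are assumptions, and pre-release suffixes are not handled.

```python
import operator

# Comparison operators understood by this sketch.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse(version: str) -> tuple:
    """Parse "X", "X.Y" or "X.Y.Z" into a 3-tuple, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version: str, constraint: str) -> bool:
    """True if version meets every space-separated clause of constraint."""
    candidate = parse(version)
    for clause in constraint.split():
        # Two-character operators are listed first so ">=" is not read as ">".
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = parse(clause[len(symbol):])
                if not OPS[symbol](candidate, bound):
                    return False
                break
        else:
            raise ValueError(f"unrecognised clause: {clause!r}")
    return True
```

Here satisfies("1.3", ">=1.2 <2.0.0") is true and satisfies("2.0.0", ">=1.2 <2.0.0") is false, whether the candidate version was read from a git tag or a tarball name.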

Honestly, I can't see anyone using any dependency manager without support for semver. Maybe for some small projects (scripts maybe). Once you start using this for a project with frameworks that introduce BC breaks, and this project is running in production, you will miss semver support.

Hologos avatar Jan 02 '19 14:01 Hologos

I don't see the reason to program it to every plugin since the logic will be the same.

Yeah I agree with this. It probably makes more sense to do it in the core. That means we'll need to design some kind of extension to the plugin protocol described in architecture.md.

There's a lot to think about in terms of how the reup command should interact with version constraints. Probably the right model is "update the commit hash to the latest compatible version (tag)." Projects will probably have two big options for doing SemVer:

  1. Use versioning as a way of controlling how peru reup updates their dependencies. In this case, the programmer would still explicitly ask for updates.
  2. Get rid of fixed versioning entirely, and rely on the semver logic to always fetch the latest compatible version. In this case, updates come automatically without any downstream intervention at all.

The first approach will be pretty consistent with what the peru best practices have been so far. But the second approach will be pretty different, and I expect various issues will come up if we try it out. The first one I can think of, is that peru caching functions very similarly to the Cargo lockfile model, except that it's more hidden from the programmer. It might be that the current caching rules are too aggressive or something like that. We'll have to see.

oconnor663 avatar Jan 02 '19 15:01 oconnor663

The first approach will be pretty consistent with what the peru best practices have been so far. But the second approach will be pretty different, and I expect various issues will come up if we try it out

It's gonna be better to implement it as described in 1). You may change it to default in some future release if people use mainly this option.

Hologos avatar Jan 24 '19 18:01 Hologos

@Hologos

I think the complete opposite. I don't see the reason to program it to every plugin since the logic will be the same.

The more I think about this, the more I agree that SemVer support will require some centralized logic. This is especially true when considering reup. The last thing users will want to deal with is subtly different SemVer logic in disparate plugin implementations. Similarly, we shouldn't duplicate logic in a bunch of plugins.

I like to leverage the flexibility of the plugin system when possible, but this is probably not a good case for that! I hadn't really considered that this could easily expand beyond a niche git plugin. SemVer is definitely a core feature, not a one-off plugin.

@oconnor663

There's a lot to think about in terms of how the reup command should interact with version constraints.

I think option 1 would be the best thing to support and suggest to users. I'm not convinced that SemVer should completely subsume fixed versioning.

olson-sean-k avatar Jan 24 '19 18:01 olson-sean-k

I think a good design will be able to support both use cases at the same time. On the one hand you have projects that commit their rev fields. There, a semver field could control how peru reup updates the rev, but the rev would remain the sole source of truth for peru sync. On the other hand you have projects that omit rev because they want "always up-to-date dependencies/resources without any maintenance steps at all" or something like that. In that case, a semver field could affect the revision chosen by peru sync.
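A small sketch of how those two use cases could coexist is below. It assumes a hypothetical semver field naming a series such as "1.4", plain X.Y.Z tag names, and a module represented as a dict. For simplicity it stores the tag name in rev, whereas peru reup as described earlier in the thread writes a commit hash.

```python
from typing import Optional, Sequence


def highest_matching(tags: Sequence[str], series: str) -> Optional[str]:
    """Highest tag in a series such as "1.4"; tags are assumed to be X.Y.Z."""
    prefix = tuple(int(p) for p in series.split("."))
    keyed = [(tuple(int(p) for p in t.split(".")), t) for t in tags]
    hits = [(k, t) for k, t in keyed if k[: len(prefix)] == prefix]
    return max(hits)[1] if hits else None


def rev_for_sync(module: dict, tags: Sequence[str]) -> Optional[str]:
    """What sync would fetch: a committed rev always wins."""
    if "rev" in module:
        return module["rev"]
    # No rev committed: fall back to the latest compatible tag.
    return highest_matching(tags, module["semver"])


def reup(module: dict, tags: Sequence[str]) -> dict:
    """Return a copy of module with rev moved to the latest compatible tag."""
    updated = dict(module)
    newest = highest_matching(tags, module["semver"])
    if newest is not None:
        updated["rev"] = newest
    return updated
```

In this sketch the constraint only influences sync when rev is absent, so projects that commit rev keep reproducible fetches and only move forward when someone runs reup.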

oconnor663 avatar Jan 24 '19 20:01 oconnor663

That would be great. I know you guys are busy, but do you have any idea when this could be implemented? Not a specific date, but an estimate such as 6 months from now, or a year?

Thank you

Hologos avatar Jan 24 '19 20:01 Hologos

I'm afraid I don't. I have another project taking up the majority of my time right now, and I think after that ends it's likely I'll have other commitments. The best chance for this feature will be if someone else is willing to take point on it. I'd be happy to be a mentor for it though.

oconnor663 avatar Jan 24 '19 20:01 oconnor663

I've spent several days looking for a language-agnostic dependency manager, but apart from peru I haven't found anything (as maintained and feature-rich as peru), so I'd like to try to implement this. I don't have much experience with Python on this scale and I will need some guidance. Would you be willing to comment on my future PR @oconnor663 ?

Hologos avatar May 13 '19 10:05 Hologos

Absolutely. And let me know if you'd like to do a video chat or something like that beforehand to talk about the design. You can reach me at the same username at Gmail or Keybase.

oconnor663 avatar May 14 '19 05:05 oconnor663

4 years later, any progress on this? @Hologos @oconnor663

I would start working on it myself, but sadly I never learned python

I guess I could learn python, but is this feature even realistic? Can anyone who tried implementing it give some feedback?

Araxeus avatar Feb 16 '23 11:02 Araxeus

In general, whatever you put in rev gets passed to the git rev-parse command under the covers. One thing to note is that, by default, peru reup will still replace that rev field with the master commit hash. You can run peru reup [some other module] to avoid this, or you can also do something like the following to tell reup to target something other than master for this module:

Maybe instead of full semver support:

If rev is a version, on reup instead of replacing it with the latest commit hash, replace it with the latest git tag

This would solve a lot of problems! You could specify a version and it gets updated automatically like in other package managers... Actual semver support can be added later
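A sketch of the "latest git tag" lookup could work on the text printed by git ls-remote --tags <url>, which lists one "<hash><TAB>refs/tags/<name>" line per tag, plus a "<name>^{}" line giving the commit behind an annotated tag. The helper names and the SAMPLE data below are made up, and running git itself is left out so the example stays self-contained.

```python
from typing import Optional


def tags_from_ls_remote(output: str) -> dict:
    """Map tag name -> commit hash from `git ls-remote --tags` output."""
    tags = {}
    for line in output.splitlines():
        sha, _, ref = line.partition("\t")
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        if name.endswith("^{}"):
            # Peeled entry: the commit an annotated tag points at; prefer it.
            tags[name[:-3]] = sha
        else:
            tags.setdefault(name, sha)
    return tags


def latest_tag(tags) -> Optional[str]:
    """Highest purely numeric tag (optional "v" prefix); others are skipped."""

    def key(name):
        try:
            return tuple(int(p) for p in name.lstrip("v").split("."))
        except ValueError:
            return None

    versions = [(key(n), n) for n in tags if key(n) is not None]
    return max(versions)[1] if versions else None


# Made-up sample output with shortened placeholder hashes.
SAMPLE = (
    "aaa\trefs/tags/v1.3.0\n"
    "bbb\trefs/tags/v1.4.1\n"
    "ccc\trefs/tags/v1.4.1^{}\n"
    "ddd\trefs/tags/2.0-rc1\n"
)
```

A reup built this way would write either the tag name or the hash stored next to it into rev. Tags such as 2.0-rc1 are skipped here because they are not purely numeric.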

Araxeus avatar Feb 16 '23 12:02 Araxeus

4 years later, any progress on this? @Hologos @oconnor663

I would start working on it myself, but sadly I never learned python

I guess I could learn python, but is this feature even realistic? Can anyone who tried implementing it give some feedback?

I hadn't had time to implement it when I needed it and moved to other projects where peru wasn't needed anymore.

Hologos avatar Feb 16 '23 14:02 Hologos

Thank you Hologos, I've opened another issue which should be much easier to implement (#233)

Araxeus avatar Feb 16 '23 15:02 Araxeus

@Hologos out of curiosity – what alternatives to peru are you using?

Araxeus avatar Feb 17 '23 14:02 Araxeus

@Hologos out of curiosity – what alternatives to peru are you using?

Every language has its own dependency manager: for Python it's pip (with PyPI), for PHP it's Composer, for Java it's either Maven or Gradle, etc.

Hologos avatar Feb 17 '23 14:02 Hologos

I'm using peru to download specific vendor files from a repo, here's my use case: https://github.com/Araxeus/Youtube-Volume-Scroll/pull/34/files#diff-d24f12e22334138ae5c51d0d1b1791c3d46577c15ab1b71f2b6777e0d79a415b

NPM, for example – doesn't allow downloading a specific file and choosing the output location :) I could always make custom scripts to automate all that, but I prefer having a more general solution

Araxeus avatar Feb 17 '23 14:02 Araxeus