New `mvn` scheme to replace `lib` scheme.

Open jurgenvinju opened this issue 1 year ago • 16 comments

The current plan is to slowly move away from the lib:// scheme with the following idea:

  • RASCAL.MF will only name the library dependencies by string name, not by URI
  • Or/And: Require-Libraries will disappear completely as a feature
  • Replaced by the dependencies of pom.xml
  • And a new library scheme will unambiguously define the location of a mvn jar like so: mvn://package-id/project-id#VERSION-CONSTRAINT
  • The project configuration code that produces pathConfig instances will create the mvn URIs.
  • The mvn URI resolver will look in the .m2 cache and otherwise try to use mvn to download the artefacts, or otherwise throw IO exceptions.
  • Even though mvn URIs are also not universal (dependent on the repositories configuration in the local pom.xml), at least they are precise and they are deterministic. As long as projects do not move between different repo's they are reasonable.

That's the plan.
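
The cache lookup step of the plan can be sketched as follows. This is a minimal illustration in Python, not the Rascal implementation: the names `m2_jar_path` and `resolve_jar` are hypothetical, the default repository location `~/.m2/repository` is assumed, and an exact version is assumed instead of a version constraint. The directory layout used (dots in the group id become directories, followed by artifact id and version) is the standard Maven local repository layout.

```python
from pathlib import Path
from typing import Optional


def m2_jar_path(group_id: str, artifact_id: str, version: str,
                repo: Optional[Path] = None) -> Path:
    """Hypothetical helper: where the default Maven layout stores a jar."""
    if repo is None:
        # Assumption: the default local repository location is in use.
        repo = Path.home() / ".m2" / "repository"
    group_dir = Path(*group_id.split("."))  # org.rascalmpl -> org/rascalmpl
    return repo / group_dir / artifact_id / version / f"{artifact_id}-{version}.jar"


def resolve_jar(group_id: str, artifact_id: str, version: str,
                repo: Optional[Path] = None) -> Path:
    """Return the cached jar, or raise an IO-style error if it is absent."""
    jar = m2_jar_path(group_id, artifact_id, version, repo)
    if not jar.is_file():
        # A real resolver would first try to download the artifact via mvn.
        raise FileNotFoundError(f"not in local repository: {jar}")
    return jar
```

A resolver built this way is deterministic for a given local repository, which is the property the plan asks for; the download fallback is deliberately left out of the sketch.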

Originally posted by @jurgenvinju in https://github.com/usethesource/rascal-language-servers/issues/110#issuecomment-1934021438

jurgenvinju avatar Feb 08 '24 12:02 jurgenvinju

https://github.com/cmarchand/maven-catalogBuilder-plugin/wiki

jurgenvinju avatar Feb 08 '24 12:02 jurgenvinju

Can we merge this with this discussion? https://github.com/usethesource/rascal/issues/1886

DavyLandman avatar Feb 08 '24 13:02 DavyLandman

merged here. We can start by implementing mvn:/// with an eye for efficiency. This is going to be an often-used scheme.

jurgenvinju avatar Feb 08 '24 16:02 jurgenvinju

I like the idea of this. my only addition is, maybe not use the fragment for the version?

DavyLandman avatar Feb 09 '24 10:02 DavyLandman

pip uses the @ sign for versions: pip install -U git+https://github.com/AdaCore/[email protected] Which is also sweet. But @ is not reserved as a separator in URI's like # is.

jurgenvinju avatar Feb 09 '24 10:02 jurgenvinju

image

That's the way it's done in the mvn community when they want to use URIs for packages

jurgenvinju avatar Feb 09 '24 10:02 jurgenvinju

> image
>
> That's the way it's done in the mvn community when they want to use URIs for packages

What's the source? I don't think that maven plugin you mentioned reflects the maven community.

DavyLandman avatar Feb 09 '24 10:02 DavyLandman

For example, gradle uses <groupId>:<artifactId>:<version> (like: io.micronaut.test:micronaut-test-spock:1.1.3)

so does maven itself as well: if you run mvn dependency:tree for example:

[INFO] --- dependency:3.6.0:tree (default-cli) @ nescio-core ---
[INFO] engineering.swat:nescio-core:jar:1.1.0-SNAPSHOT
[INFO] +- org.rascalmpl:rascal:jar:0.33.7:compile
[INFO] +- org.rascalmpl:rascal-lsp:jar:2.18.0:compile
[INFO] +- org.rascalmpl:typepal:jar:0.8.9:compile
[INFO] \- junit:junit:jar:4.13.1:test
[INFO]    \- org.hamcrest:hamcrest-core:jar:1.3:test

So that would mean either it's mvn://org.rascalmpl:typepal:0.8.9/ or it's mvn:///org.rascalmpl/typepal/0.8.9/ ?

DavyLandman avatar Feb 09 '24 10:02 DavyLandman

yes I saw that too, but printing is not enough. We have to be sure it also parses unambiguously. This depends on the restrictions we have on group id's and artifact id's, and that intersected with the constraints on authorities in URIs. If we are lucky we do not need any encodings; the group/artifact id's are then fully contained in the authority character class, and we have characters left outside of that intersection in authorities that we can use as separators.

Source material:

  • https://maven.apache.org/maven-conventions.html#:~:text=These%20identifiers%20should,the%20group%20ID.
  • https://maven.apache.org/guides/mini/guide-naming-conventions.html
  • https://docs.oracle.com/javase/specs/jls/se6/html/packages.html#7.7

I think that mvn://org.rascalmpl:typepal:0.8.9/ that you proposed is allowed according to that documentation!

jurgenvinju avatar Feb 09 '24 10:02 jurgenvinju

Definitely, I'd like the root of a file system to be encoded in the authority/host field of the URI. The files inside of the jar can go in the path then. Otherwise, the file system metaphor breaks inside Rascal (we don't know what a root is anymore).

jurgenvinju avatar Feb 09 '24 10:02 jurgenvinju

I'd also like to propose we do not support any shorthands, so no:

  • |mvn://project-id:version/
  • |mvn://group-id:project-id| etc.

The mvn scheme should unambiguously declare which package to use and which version, leading to an unambiguously identified jar file in the local repository.

jurgenvinju avatar Feb 09 '24 11:02 jurgenvinju

> yes I saw that too, but printing is not enough. We have to be sure it also parses unambiguously.

For gradle their syntax is not printing, it's how you describe dependencies.

> This depends on the restrictions we have on group id's and artifact id's, and that intersected with the constraints on authorities in URIs. If we are lucky we do not need any encodings; the group/artifact id's are then fully contained in the authority character class, and we have characters left outside of that intersection in authorities that we can use as separators.

I think the biggest problem is that : in the authority has a special meaning. It either separates user and password before the @ sign or specifies the port (see https://datatracker.ietf.org/doc/html/rfc3986#section-3.2), so for example: http://user:[email protected]:8080/
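
To make the colon problem concrete, here is a small sketch using Python's standard `urllib.parse`. It only shows how one generic URI parser reads the colon-separated form; other parsers (Java's, VS Code's) may behave differently, and the helper name `authority_view` is hypothetical.

```python
from urllib.parse import urlsplit


def authority_view(uri: str):
    """Return (hostname, port) as a generic parser sees them.

    The port is reported as the string "invalid" when the text after
    the first colon cannot be read as a port number.
    """
    parts = urlsplit(uri)
    try:
        port = parts.port
    except ValueError:
        port = "invalid"
    return parts.hostname, port


# Colon-separated coordinates in the authority: the parser treats
# everything after the first ':' as a (malformed) port.
print(authority_view("mvn://org.rascalmpl:typepal:0.8.9/"))

# Path-based coordinates: the authority stays empty and nothing is misread.
print(urlsplit("mvn:///org.rascalmpl/typepal/0.8.9/").path)
```

With this parser, only `org.rascalmpl` survives as the host in the first form, so the artifact id and version would have to be recovered from the raw authority string by hand.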

> Definitely, I'd like the root of a file system to be encoded in the authority/host field of the URI. The files inside of the jar can go in the path then. Otherwise, the file system metaphor breaks inside Rascal (we don't know what a root is anymore).

Agreed, I prefer that as well.

> The mvn scheme should unambiguously declare which package to use and which version, leading to an unambiguously identified jar file in the local repository.

Agree.

DavyLandman avatar Feb 09 '24 11:02 DavyLandman

section-2.3 says we only have these characters if we want to stay out of accidental other interpretations: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

And mvn says this: These identifiers should be comprised of lowercase letters, digits, and hyphens only

So the intersection is: [a-z0-9\-] and we have [\.\_~] as possible separators between project id, group id and version.

jurgenvinju avatar Feb 09 '24 11:02 jurgenvinju

What the URI standard does not say is that authorities in VScode will be normalized to lowercase characters anyway.

jurgenvinju avatar Feb 09 '24 11:02 jurgenvinju

but . are already used in maven ids, so we only have _ and ~ as possible separator chars?
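
The character-set argument can be checked mechanically. The sketch below is only an illustration: the two sets are taken from the RFC 3986 section 2.3 quote and the Maven convention quoted above, the sample ids come from the `mvn dependency:tree` output earlier in this thread, and the function name `free_separators` is hypothetical.

```python
import string

# RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Maven convention quoted above: lowercase letters, digits, and hyphens.
MAVEN_ID = set(string.ascii_lowercase + string.digits + "-")


def free_separators(ids):
    """Unreserved characters that no id uses and that could act as separators."""
    # Uppercase letters are unreserved too, but they are not sensible
    # separators and authorities may be normalized to lowercase anyway.
    candidates = UNRESERVED - MAVEN_ID - set(string.ascii_uppercase)
    used = set().union(*ids)  # every character that occurs in some id
    return candidates - used


print(sorted(free_separators([])))
print(sorted(free_separators(["org.rascalmpl", "typepal", "0.8.9"])))
```

On paper the candidates are `.`, `_` and `~`; once real group ids and versions containing dots are taken into account, only `_` and `~` remain.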

DavyLandman avatar Feb 09 '24 12:02 DavyLandman

theoretically yes. so it seems we should use the path component / anyway? `mvn:///group/project/version/root` or `mvn://group/project/version/root`?
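
For the first of these two forms, with an empty authority, parsing could look like the sketch below. It is an assumption-laden illustration, not a settled design: the names `MvnLocation` and `parse_mvn_path_uri` are hypothetical, and the rule that group, artifact and version must all be present follows the "no shorthands" proposal made earlier in the thread.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class MvnLocation(NamedTuple):
    group_id: str
    artifact_id: str
    version: str
    path_in_jar: str


def parse_mvn_path_uri(uri: str) -> MvnLocation:
    """Parse the hypothetical form mvn:///group/artifact/version/path."""
    parts = urlsplit(uri)
    if parts.scheme != "mvn" or parts.netloc:
        raise ValueError("expected mvn:/// with an empty authority")
    segments = parts.path.split("/")  # ['', group, artifact, version, ...]
    # No shorthands: all three coordinates must be present and non-empty.
    if len(segments) < 4 or not all(segments[1:4]):
        raise ValueError("group, artifact and version are all required")
    group_id, artifact_id, version = segments[1:4]
    return MvnLocation(group_id, artifact_id, version,
                       "/" + "/".join(segments[4:]))


print(parse_mvn_path_uri("mvn:///org.rascalmpl/typepal/0.8.9/META-INF/RASCAL.MF"))
```

The trade-off is the one raised earlier in the thread: with coordinates in the path, the root of the jar is no longer the URI authority, so the first three path segments have to be treated as the file-system root by convention.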

jurgenvinju avatar Feb 09 '24 13:02 jurgenvinju